Short tandem repeat (STR) population data

Samples stored on FTA paper were obtained from 150 father-son pairs drawn from paternity cases in Rio Grande do Sul, Brazil, a region that was essentially colonized by Portuguese, German, and Italian immigrants in the 18th century [1]. The biological relationship of all father-son pairs was previously confirmed by autosomal STR analysis (AmpFℓSTR® Identifiler® PCR Amplification Kit; Applied Biosystems; Life Technologies, USA), with paternity index values >10,000. The age of the father at the time of the son’s birth was noted. All participants signed informed consent forms, and this study was carried out in accordance with approved ethical standards (CAAE 11096212.6.0000.5336; Protocol # 180.121). We used the nomenclature recommended by the DNA Commission of the International Society of Forensic Genetics [2], as well as the recommendations for publication of population data for forensic purposes [3].

Genomic DNA samples were purified from dried blood samples preserved on FTA cards (Whatman Bioscience, Cambridge, UK) and amplified using the PowerPlex® Y23 System kit (Promega Corporation, Madison, WI), which includes 23 Y-STR loci: DYS19, DYS385a/b, DYS389I/II, DYS390, DYS391, DYS392, DYS393, DYS437, DYS438, DYS439, DYS448, DYS456, DYS458, DYS635, Y-GATA-H4, DYS481, DYS533, DYS549, DYS570, DYS576, and DYS643 [4, 5]. PCR products were separated and genotyped by capillary electrophoresis on an ABI PRISM® 3130xl Genetic Analyzer (Life Technologies) and analyzed with GeneMapper® ID software v3.2 (Life Technologies, USA). All mutational events were confirmed by re-analysis and, whenever the locus involved was present in that system, by a second typing with the AmpFℓSTR® Yfiler® PCR Amplification Kit (Applied Biosystems; Life Technologies, USA). A quality control check was performed using the proficiency testing of the Y-STR Haplotyping Quality Assurance Exercise. The 23 Y-STR haplotype data for the sample of father-son pairs from Rio Grande do Sul (RS), Brazil, will be available in the YHRD database (http://www.yhrd.org). Allele and haplotype frequencies were determined using the software Arlequin, version 3.5.1.3 [6]. Haplotype diversity (HD), power of matching (PM), power of discrimination (PD), and discrimination capacity (DC) were calculated as described elsewhere [7]. Y-STR haplotype frequencies and diversity were calculated treating loci DYS385a and DYS385b as a single haplotype, and locus DYS389II after subtracting the value of DYS389I. The ages of the fathers (at the time of the son’s birth) with and without mutations were compared by a Mann–Whitney U test.
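As an illustration of the summary statistics named above, the following minimal sketch computes HD, PM, PD, and DC from a list of haplotypes. The formulas are the commonly used ones (Nei's unbiased diversity, PM as the sum of squared haplotype frequencies, PD = 1 - PM, DC as distinct haplotypes over sample size) and are an assumption here, since the exact definitions are those of reference [7]. The function names and the dictionary layout of a profile are hypothetical. The helper also shows the DYS389II adjustment and the joint treatment of DYS385a/b.

```python
from collections import Counter


def normalize_profile(profile):
    """Turn a locus -> allele mapping into a hashable haplotype.

    Assumed layout (hypothetical): integer alleles per locus, and a pair
    of alleles under the key "DYS385a/b".
    """
    adjusted = dict(profile)
    # DYS389II is reported after subtracting DYS389I.
    adjusted["DYS389II"] = profile["DYS389II"] - profile["DYS389I"]
    # DYS385a/b is kept together as one two-allele unit.
    adjusted["DYS385a/b"] = tuple(profile["DYS385a/b"])
    return tuple(sorted(adjusted.items()))


def haplotype_stats(haplotypes):
    """Return HD, PM, PD and DC under the assumed standard definitions."""
    n = len(haplotypes)
    counts = Counter(haplotypes)
    sum_p2 = sum((c / n) ** 2 for c in counts.values())
    return {
        "HD": n / (n - 1) * (1 - sum_p2),  # Nei's unbiased diversity
        "PM": sum_p2,                      # chance that two random men match
        "PD": 1 - sum_p2,
        "DC": len(counts) / n,             # distinct haplotypes / sample size
    }


if __name__ == "__main__":
    # Toy labels, not study data.
    print(haplotype_stats(["H1", "H1", "H2", "H3"]))
```

Under these assumed definitions, a sample in which every one of the 150 sons carries a distinct haplotype gives DC = 1.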

A total of 150 father-son pairs with full profiles were analyzed in this study. Allele/genotype frequencies and gene diversity values calculated for each locus in the sons are listed in Supplemental Table 1 (the six new loci of the PowerPlex Y-23 kit are underlined and shaded in gray). The highest gene diversity was observed for the single-locus marker DYS570 (GD = 0.7888) and for the two-locus system DYS385 (GD = 0.9009). We identified a total of 150 different haplotypes in the sample. Genotyping with the 23 Y-STR loci of the PowerPlex® Y23 System kit resulted in a greater discrimination capacity than in previously published studies of the same RS population with 11 Y-STR loci (DC = 84.0 %) [8] and with 17 Y-STR loci (DC = 96.9 %) [9]. Thirteen mutations were identified in the 3450 father-son allelic transfers, giving an overall mutation rate across the 23 loci of 3.768 × 10⁻³ (95 % CI, 3.542 × 10⁻³ to 3.944 × 10⁻³). This rate differed from those reported in Y-STR studies of other populations, as shown in Supplemental Table 2, but the difference was not statistically significant against an overall proportion of 2.432 × 10⁻³ (P = 0.116, Binomial-Bilateral Test; see Supplemental Table 2). Based on the data in Supplemental Table 2, we compared the proportion of mutations per meiosis (father-to-son transmission of the whole Y chromosome), calculated as the number of mutations within the whole Y chromosome divided by the number of father-son pairs analyzed; as expected, a larger number of mutations is encountered when the Y-STR system has more loci (Supplemental Fig. 1).
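The arithmetic behind these figures can be sketched as follows. The counts (13 mutations, 23 loci, 150 pairs) and the reference proportion 2.432 × 10⁻³ come from the text; the two-sided convention used here (summing the probabilities of all outcomes no more likely than the observed one) is an assumption, because the text does not say which convention the Binomial-Bilateral Test followed. Function and variable names are illustrative.

```python
import math


def binom_log_pmf(k, n, p):
    """Log of the binomial probability of k successes in n trials."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))


def binom_two_sided_p(k, n, p):
    """Exact two-sided p-value (assumed convention): add up every outcome
    whose probability does not exceed that of the observed count."""
    cutoff = binom_log_pmf(k, n, p) + 1e-9  # small tolerance for float ties
    total = 0.0
    for i in range(n + 1):
        log_prob = binom_log_pmf(i, n, p)
        if log_prob <= cutoff:
            total += math.exp(log_prob)
    return min(1.0, total)


mutations = 13
pairs = 150
transfers = 23 * pairs                       # 3450 allelic transfers

rate_per_transfer = mutations / transfers    # overall rate per locus transfer
mutations_per_meiosis = mutations / pairs    # whole-Y-chromosome proportion
p_value = binom_two_sided_p(mutations, transfers, 2.432e-3)

if __name__ == "__main__":
    print(f"rate per transfer:     {rate_per_transfer:.3e}")
    print(f"mutations per meiosis: {mutations_per_meiosis:.4f}")
    print(f"two-sided p-value:     {p_value:.3f}")
```

Working in log space avoids overflow in the binomial coefficients for n = 3450. The printed p-value can be compared with the reported P = 0.116; a different two-sided convention (for example, doubling the smaller tail) would give a different number.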

In the 23 Y-STR panel, we observed similar numbers of gains (N = 5) and losses (N = 8) of repeats in the son; these proportions were similar to the global proportions of 55.2 % gains (P = 0.225) and 44.8 % losses (P = 0.350, Binomial-Bilateral Test; see Supplemental Table 2). Some variation among studies might result from differences in sample size and/or in the number of loci analyzed. In all cases, only a single locus was mutated per pair, with a gain or loss of repeats in the son: five were gains of one repeat (single-step), seven were losses of one repeat, and in one case there was a two-repeat loss (locus DYS481) (Supplemental Table 3). Ninety-two percent (12/13) of the mutations resulted in a one-repeat difference, which is consistent with the concept that, owing to strand slippage during replication, the majority of mutations comprise single-step repeat gains or losses [10].
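The tally above can be reproduced with a short sketch. The list of step changes is rebuilt from the counts stated in the text (five single-step gains, seven single-step losses, one two-step loss), not from the individual genotypes of Supplemental Table 3; the names are illustrative.

```python
from collections import Counter

# Repeat-number change per mutation (son minus father), rebuilt from the
# counts given in the text rather than from individual genotypes.
steps = [+1] * 5 + [-1] * 7 + [-2]


def summarize(step_changes):
    """Count gains and losses and the share of single-step mutations."""
    kinds = Counter("gain" if s > 0 else "loss" for s in step_changes)
    single = sum(1 for s in step_changes if abs(s) == 1)
    return {
        "gains": kinds["gain"],
        "losses": kinds["loss"],
        "single_step_fraction": single / len(step_changes),
    }


summary = summarize(steps)

if __name__ == "__main__":
    print(summary)
```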

To confirm the paternity of these 13 alleged father-son pairs, the AmpFℓSTR® Identifiler® PCR Amplification Kit (Applied Biosystems; Life Technologies, USA) results were reviewed. None of these father-son pairs was found to have observable de novo mutations at the 15 autosomal STRs. The combined paternity index (CPI) values for each pair, based on the 15 autosomal AmpFℓSTR® Identifiler® STR loci for the father-son pair alone and with the mother’s profile included, are shown in Supplemental Table 3. In these cases, the CPI for the father-son pair was above 4.2 × 10², and when the mother’s profile was included, the CPI was above 8.3 × 10⁶. Supplemental Table 3 also shows the age of the fathers involved in the observed mutation events. No significant association between mutation events and the father’s age at the birth of the child was observed (P = 0.7970; Mann–Whitney U Test): the average age of the fathers involved in mutation events was 27.8 years (SD = 8.21), and for transmissions without mutation the average was 29.0 years (SD = 9.07). A similar result was reported in a study of 1766 father-son pairs (15,894 meioses) at nine Y-STR loci [11]. However, it contrasts with a collaborative work on 3026 father-son pairs at 17 Y-STR loci [12], in which the authors found that the mutation rate increased with age (P < 0.001; Mann–Whitney U Test), and with subsequent work that analyzed 1730 father-son pairs with the AmpFℓSTR® 17Yfiler® PCR amplification kit [13]. Variations among these and other studies are listed in Supplemental Table 4 and might be explained by differences in the number of Y-STR loci analyzed or by the still small group of fathers who transmit mutations (about 0.2 % of all evaluated men).
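The age comparison reported above relies on a Mann–Whitney U test. The sketch below shows one way such a test can be computed with the standard library only, using the normal approximation with tie and continuity corrections; whether the study used an exact or an approximate p-value is not stated, so this choice is an assumption. The ages in the usage example are invented toy values, not study data.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U for x, U for y, p-value).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0       # mean of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j

    rank_sum_x = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u1 = rank_sum_x - n1 * (n1 + 1) / 2.0
    u2 = n1 * n2 - u1

    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u1, u2, 1.0
    z = max(0.0, (abs(u1 - mean_u) - 0.5) / math.sqrt(var_u))
    p_value = math.erfc(z / math.sqrt(2))  # two-sided tail of the normal
    return u1, u2, p_value


if __name__ == "__main__":
    # Hypothetical paternal ages (years), for illustration only.
    ages_with_mutation = [22, 25, 31, 40]
    ages_without_mutation = [21, 24, 27, 29, 33, 36, 45]
    print(mann_whitney_u(ages_with_mutation, ages_without_mutation))
```

With group sizes as unequal as 13 versus 137, the normal approximation is usually considered adequate, but an exact test may give a slightly different p-value.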

The observed mutation rates at the 23 Y-STR loci were 6.7 × 10⁻² (in DYS458, DYS385a/b, and DYS456), 1.3 × 10⁻³ (in DYS481, DYS570, and DYS390), and 2 × 10⁻³ in DYS576. No mutations were observed at the remaining 16 Y-STR loci. Since the mutation rate is believed to increase with the number of repeats in the allele [10], we recorded the size category of each allele (alleles were categorized into 25, 50, and 25 % quantiles for short, moderate, and long sizes, respectively); all mutations occurred in progenitor alleles (father’s alleles) of moderate size. We observed no mutations involving a non-integral number of repeats, such as duplications, triplications, deletions, or microvariants, in our sample. Knowledge of the mutation rates of the loci described above will improve the exclusion criteria in paternity testing and the interpretation of DNA profiles in the population of Southern Brazil. These data show that the present set of 23 Y-STR markers studied with the PowerPlex® Y23 System kit is highly polymorphic and discriminative in the population of Brazil’s southernmost state.
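The 25/50/25 % size categories mentioned above can be sketched as a quartile-based classifier. The cut-off rule used here (below the first quartile is short, above the third quartile is long, everything else moderate) and the handling of alleles that fall exactly on a quartile are assumptions, since the text does not specify them; the allele values are toy numbers, not study data.

```python
import statistics


def size_classifier(father_alleles):
    """Build a classifier from the paternal allele distribution at one locus."""
    q1, _, q3 = statistics.quantiles(father_alleles, n=4)

    def classify(allele):
        if allele < q1:
            return "short"
        if allele > q3:
            return "long"
        return "moderate"  # includes alleles equal to a quartile (assumed)

    return classify


# Hypothetical repeat numbers at a single locus, for illustration only.
toy_alleles = [14, 15, 15, 16, 16, 16, 16, 17, 17, 18]
classify = size_classifier(toy_alleles)

if __name__ == "__main__":
    for allele in sorted(set(toy_alleles)):
        print(allele, classify(allele))
```

Because repeat numbers are discrete, the realized group sizes only approximate the 25/50/25 % split.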