Introduction

The analysis of human short tandem repeats (STRs), or microsatellite loci, has become a useful tool in forensic genetics because of their variation in repeat number, high levels of diversity, and stable inheritance in the human genome. Their repeat number can also be amplified faithfully using the polymerase chain reaction (PCR) (Edwards et al. 1992; Kayser et al. 1997). In general, binary markers such as single nucleotide polymorphisms (SNPs) are best suited for studies of ancient divergences in human evolution, since they tend to have low probabilities of back and parallel mutation and their ancestral states can be determined (Hammer and Zegura 1996). In contrast, the genetic features of STR loci may provide more useful information for investigating and reconstructing the phylogeny of the more recently diversified human lineages (Hammer and Zegura 1996; Forster et al. 2000), as well as for forensic and paternity testing (de Knijff et al. 1997).

Applications of STR analysis in forensic casework benefit from large population databases for estimating the probability of identity by chance (Allen et al. 1998; Pfeiffer et al. 2001; Imaizumi et al. 2002). The use of additional STR markers would provide sufficient forensic information for more difficult cases in paternity or maternity analyses, such as deficient cases (i.e., only the alleged father and the child are available), missing persons, or cases in which mutations are encountered. Many forensic communities have proposed the inclusion of additional loci because of the potential for false matches when large numbers of comparisons are made within and between databases (Weir 2007; Schneider 2009; Hill et al. 2011). Thus, it is important that forensic genetic databases of STR loci continue to be expanded and become more reliable, to provide a better tool for forensic analysis. Although several databases of STR loci have been published and are used in forensic and population genetics in Mongolia (Varga et al. 2003; Kwak et al. 2005; Zha et al. 2014), the amount of available data for STR loci in the Mongolian population is still limited. Thus, analysis of an extended set of STR loci may be a powerful tool for forensic analyses in the Mongolian population. This led us to generate further reliable STR data sets from the Mongolian population and to evaluate their usefulness, in order to expand the database available to the forensic community.

In this study, we therefore analyzed 24 loci, including autosomal and Y-chromosomal STRs, a Y-indel, and a sex-determining marker, in 267 unrelated individuals from the Mongolian population using the GlobalFiler™ PCR Amplification Kit to provide an expanded and more reliable forensic database.

Materials and methods

Subjects and DNA extraction

We studied DNA samples from 267 healthy Mongolian individuals (Khalkh, n = 216; Bayad, n = 10; Dorwod, n = 9; Kazakh, n = 7; Khotgoid, n = 4; Zakhchin, n = 4; Buriad, n = 3; Torguud, n = 3; Darkhad, n = 2; Uriankhai, n = 2; Uuld, n = 2; Dariganga, Khoton, Myangad, Sartuul, and unknown, n = 1 each) selected at random (and therefore likely to be unrelated) in Ulaanbaatar, Mongolia. Genomic DNA was extracted from buccal swabs using the Exgene™ Clinic SV kit (GeneAll, Korea) according to the manufacturer’s instructions. Written informed consent was obtained from each donor before the buccal swab was collected.

PCR and genotyping

PCR amplification of 24 loci (D3S1358, vWA, D16S539, CSF1PO, TPOX, D8S1179, D21S11, D18S51, D2S441, D19S433, TH01, FGA, D22S1045, D5S818, D13S317, D7S820, SE33, D10S1248, D1S1656, D12S391, D2S1338, DYS391, Y-indel, and Amelogenin) was performed using the GlobalFiler™ PCR Amplification Kit (Applied Biosystems, Foster City, CA, USA). PCR was carried out on a GeneAmp PCR System 9700 (Applied Biosystems, Foster City, CA, USA) according to the manufacturer’s recommendations. PCR products were confirmed by 2% agarose gel electrophoresis. Amplified products were analyzed by capillary electrophoresis on an ABI 3500xl Genetic Analyzer (Applied Biosystems, Foster City, CA, USA) with the manufacturer-provided allelic ladders, bins, and panels. GeneScan 600 LIZ (Applied Biosystems, Foster City, CA, USA) was used as the size standard for capillary electrophoresis.

Data analysis

The genotype data were analyzed using the GeneMapper ID-X software (Applied Biosystems, Foster City, CA, USA) and Microsoft Excel (Microsoft, Redmond, WA, USA). An exact test for Hardy–Weinberg equilibrium (HWE) was performed using PowerMarker version 3.25 software (Liu and Muse 2005). Pairwise genetic distances (FST) were calculated with Phylip version 3.695 (Felsenstein), and the FST values were visualized in a multidimensional scaling (MDS) plot using IBM SPSS Statistics 23 (IBM Korea, Korea). Forensic statistical parameters, including allele frequencies, heterozygosities, and polymorphism information content (PIC), were calculated with PowerMarker version 3.25 software (Liu and Muse 2005). Paternity testing parameters were calculated using PowerStats version 1.2 (Tereba 1999).
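As an illustration of the per-locus parameters named above, the following minimal Python sketch computes allele frequencies, observed heterozygosity, expected heterozygosity (gene diversity), and PIC from a list of diploid genotypes. It is not the PowerMarker implementation: the function name `locus_stats` and the toy genotypes are hypothetical, and the sketch uses the simple uncorrected estimator 1 − Σp², whereas dedicated software may apply a sample-size correction.

```python
from collections import Counter
from itertools import combinations


def locus_stats(genotypes):
    """Summarize one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per individual.
    Returns allele frequencies, observed and expected heterozygosity, and PIC.
    """
    n = len(genotypes)
    # Each individual contributes two alleles to the count.
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    freqs = {allele: count / (2 * n) for allele, count in allele_counts.items()}

    # Observed heterozygosity: fraction of individuals with two different alleles.
    h_obs = sum(1 for a, b in genotypes if a != b) / n

    # Expected heterozygosity (uncorrected gene diversity): 1 - sum(p_i^2).
    h_exp = 1.0 - sum(p * p for p in freqs.values())

    # PIC: 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2.
    pic = h_exp - sum(2 * (p * q) ** 2 for p, q in combinations(freqs.values(), 2))

    return {"freqs": freqs, "h_obs": h_obs, "h_exp": h_exp, "pic": pic}


if __name__ == "__main__":
    # Hypothetical toy data: four individuals typed at one locus.
    toy = [(9, 9), (9, 10), (10, 10), (9, 10)]
    print(locus_stats(toy))
```

With real data, one such summary would be produced per locus and the results tabulated alongside the allele frequencies.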

Results and discussion

We assessed statistical parameters of 24 loci, including autosomal and Y-chromosomal STRs, a Y-indel, and a sex-determining marker, using the GlobalFiler™ PCR Amplification Kit in a sample of 267 unrelated individuals from the Mongolian population. Of the 15 Mongolian ethnic groups represented in the sample, the Khalkh account for about 80% of the entire Mongolian population. Genetic characteristics of the 24 GlobalFiler PCR Amplification kit loci are shown in Table 1, and their allele frequencies and forensic parameters are listed in Table 2. All the loci were found to be highly polymorphic in the population. The exact test showed no significant deviations from Hardy–Weinberg equilibrium at any locus except CSF1PO and FGA.

Table 1 Genetic characteristics of 24 GlobalFiler PCR Amplification kit loci in the present study
Table 2 Allele frequencies and statistical parameters for twenty-one autosomal loci of GlobalFiler PCR Amplification kit in the Mongolian population (n = 267)

The parameters estimated here provide valuable information for forensic applications. A total of 267 different DNA profiles were found in this work. The highest gene diversity was observed at the SE33 locus (0.9376), and the lowest at the TPOX locus (0.6142). This result indicates that SE33 is the most informative of the 24 loci surveyed here. Although the individual power of discrimination varied among the studied loci, the combined probability of a match (PM) for the 21 STR loci was estimated to be 1.139 × 10^−24, which is highly informative.
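The combined PM is conventionally obtained by multiplying the per-locus matching probabilities, which assumes that the loci are independent. A minimal Python sketch under that assumption follows; the function names and the input values are hypothetical and are not taken from Table 2.

```python
from collections import Counter
from math import prod


def matching_probability(genotypes):
    """Per-locus PM: sum of squared observed genotype frequencies.

    genotypes: list of (allele_a, allele_b) tuples; allele order is ignored,
    so alleles within a locus are assumed to be mutually comparable.
    """
    n = len(genotypes)
    counts = Counter(tuple(sorted(pair)) for pair in genotypes)
    return sum((count / n) ** 2 for count in counts.values())


def combined_matching_probability(per_locus_pm):
    """Combined PM across loci, assuming the loci are independent."""
    return prod(per_locus_pm)


if __name__ == "__main__":
    # Hypothetical toy genotypes for a single locus.
    toy = [(9, 10), (10, 9), (9, 9), (10, 10)]
    pm = matching_probability(toy)
    print("PM:", pm, "PD:", 1.0 - pm)  # power of discrimination is 1 - PM

    # Hypothetical per-locus PM values for three loci.
    print("combined PM:", combined_matching_probability([0.10, 0.05, 0.02]))
```

Because each per-locus PM is well below 1, the product shrinks rapidly as loci are added, which is why a 21-locus panel can reach values on the order reported above.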

Although Mongolian forensic DNA laboratories have generated reliable population data sets using standardized genetic markers (i.e., the 13 CODIS STR loci), it is important that forensic genetic databases of STR loci continue to be expanded and become more reliable, to provide a better tool for forensic analysis (Varga et al. 2003; Kwak et al. 2005; Zha et al. 2014). For example, the PowerPlex-16 system is capable of simultaneously amplifying all 13 CODIS STR loci, amelogenin, and two pentanucleotide STR loci, Penta D and Penta E (Sprecher et al. 2000; Krenke et al. 2002). The 13 CODIS core STR loci are located on 12 different chromosomes, with CSF1PO and D5S818 both residing on chromosome 5, separated by approximately 24 centiMorgans (cM) (Bacher et al. 2000). Therefore, the values for paternity index and power of exclusion for the 13 CODIS STR set would be expected to be lower than those for completely unlinked loci (i.e., ≥50 cM apart) (Lins et al. 1998). In this study, the combined PM value calculated from the 17 unlinked STR loci (Table 1) is 4.23 × 10^−20, which is also highly informative.
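To give a sense of what a 24 cM separation means, a map distance can be converted to an approximate recombination fraction with a mapping function. The sketch below uses Haldane's function, r = ½(1 − e^(−2d)) with d in Morgans; this choice is an assumption made for illustration, since the text does not state which mapping function, if any, underlies the cited distances.

```python
from math import exp


def haldane_recombination_fraction(distance_cm):
    """Convert a genetic map distance in centiMorgans to a recombination
    fraction using Haldane's mapping function (assumes no crossover
    interference)."""
    d_morgans = distance_cm / 100.0
    return 0.5 * (1.0 - exp(-2.0 * d_morgans))


if __name__ == "__main__":
    # Approximate CSF1PO-D5S818 separation quoted in the text.
    print(round(haldane_recombination_fraction(24.0), 4))
```

Under this assumption the recombination fraction at 24 cM is roughly 0.19, well below the value of 0.5 that characterizes free recombination, which is consistent with treating the two chromosome 5 loci as not fully independent.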

There are known to be about 20 ethnic Mongolian groups, as well as many people of mixed ethnic origin. The population of Mongolia is nevertheless regarded as homogeneous, with Mongolian-speaking people constituting 95% of the total; the largest subgroup is the Khalkh, accounting for about 80% of the total population. The only substantial non-Mongol group, representing over 5% of the population, is the Kazakhs, a Turkic-speaking people dwelling in the far west (http://www.un-mongolia.mn). A population comparison based on pairwise FST genetic distances, calculated from allele frequencies of 15 shared STR loci (D2S1338, TPOX, D3S1358, FGA, D5S818, CSF1PO, D7S820, D8S1179, TH01, vWA, D13S317, D16S539, D18S51, D19S433, D21S11) obtained from 25 different Eurasian and African populations, is shown in Table 3 (Dobashi et al. 2005; Kraaijenbrink et al. 2007; Toscanini et al. 2015; Yuan et al. 2014; Omran et al. 2009; Sadam et al. 2015; Chaudhari and Dahiya 2014; Tie et al. 2006; Park et al. 2016; Maruyama et al. 2008; Ramos-González et al. 2016; Ota et al. 2007; Smith et al. 2009; Piatek et al. 2008; Almeida et al. 2015; Novković et al. 2010; Tillmar et al. 2009; Babiker et al. 2011; Rerkamnuaychoke et al. 2006; Hill et al. 2013). A multidimensional scaling (MDS) plot of the 25 Eurasian and African populations based on the pairwise FST genetic distance values is depicted in Fig. 1. The plot showed three distinct clusters (Asian, European/Hispanic, and African). As expected, the Koreans clustered with Mongolian ethnic groups and East Asian groups including Chinese and Japanese populations (Fig. 1). This result is consistent with a previous report derived from datasets of mitochondrial DNA and Y-chromosome markers (Jin et al. 2009). The MDS plot also showed that the Mongolians had affinity to both European and Asian populations, although Mongolia is geographically located in Northeastern Asia.

Table 3 Pairwise FST genetic distances of Mongolian and other populations using 15 shared STR loci
Fig. 1 Multidimensional scaling (MDS) plot based on FST genetic distances. The Mongolians are represented by a closed diamond; Asian populations by closed squares; populations of European origin by open squares; American populations by closed circles; and African populations by open circles
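For readers who wish to reproduce a plot of this kind from a pairwise distance matrix, a minimal sketch of classical (Torgerson) MDS in Python with NumPy is given below. This is a swapped-in technique for illustration: the analysis reported here used IBM SPSS Statistics, whose MDS procedures differ in detail, and the function name and the toy distance matrix are hypothetical rather than values from Table 3.

```python
import numpy as np


def classical_mds(dist, n_components=2):
    """Classical (Torgerson) MDS.

    dist: symmetric (n x n) matrix of pairwise distances.
    Returns an (n x n_components) array of coordinates.
    """
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    # Double-center the squared distances: B = -1/2 * J * D^2 * J.
    centering = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centering @ (d ** 2) @ centering
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(b)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Clip small negative eigenvalues that arise from non-Euclidean distances.
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


if __name__ == "__main__":
    # Hypothetical pairwise distances among four populations.
    toy = np.array([
        [0.00, 0.01, 0.05, 0.06],
        [0.01, 0.00, 0.05, 0.05],
        [0.05, 0.05, 0.00, 0.02],
        [0.06, 0.05, 0.02, 0.00],
    ])
    print(classical_mds(toy))
```

The two returned columns can be used as the x and y coordinates of each population in a scatter plot; the orientation and sign of the axes are arbitrary.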

In conclusion, our data can be used to extend the results obtained with other STRs, as well as provide valuable information for forensic and population genetic studies in the Mongolian population.