The establishment of a Y-chromosome short tandem repeat (Y-STR) database can provide essential data for forensic identification, population genetics, human evolution, and other research [1,2,3]. The Chaoshan region is the birthplace of the Minnan Teochew dialect, which differs markedly from the other dialects spoken in Guangdong (Cantonese, Hakka, and Leizhou Min). Chaoshan culture is likewise distinct from that of its neighbors in Guangdong and the rest of China. Jieyang, part of the Chaoshan region, is a prefecture-level city in Guangdong Province, the People’s Republic of China (Fig. S1). According to the 2010 census (www.stats.gov.cn), Jieyang has a population of 5,877,025. So far, no data on Y-STR genetic polymorphisms in Chinese Teochew people are available. In this study, we collected bloodstains from 328 unrelated healthy Teochew males whose families have lived in Jieyang for three or more generations. All participants signed informed consent forms. This study was conducted in accordance with the ISFG recommendations on the use of Y-STRs in forensic analysis [4].

Bloodstains were directly amplified with the PowerPlex® Y23 System (Promega, Madison, WI, USA) [5]. Polymerase chain reaction (PCR) was performed on a GeneAmp PCR 9700 thermal cycler (Thermo Fisher Scientific, Foster City, CA, USA) according to the manufacturer’s recommendations. PCR products were then separated on an ABI 3500xL Genetic Analyzer (Thermo Fisher Scientific), and the results were analyzed with GeneMapper ID-X software (Thermo Fisher Scientific). Haplotype diversity (or genetic diversity) was calculated as $HD = \frac{n}{n-1}\left(1 - \sum P_i^2\right)$, where $P_i$ is the frequency of the i-th allele or haplotype and $n$ is the number of samples. The discrimination capacity (DC) was calculated as $DC = m/n$, where $m$ is the number of different haplotypes and $n$ is the total number of samples. Analysis of molecular variance (AMOVA) and multidimensional scaling (MDS) were performed with the online tools of the Y-STR Haplotype Reference Database (YHRD, http://www.yhrd.org). The neighbor-joining phylogenetic tree was constructed with MEGA 6.0 software, based on the Rst values.
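The two summary statistics defined above can be illustrated with a minimal Python sketch. It is not the software used in this study; the function names and the toy haplotypes are hypothetical, and each haplotype is assumed to be represented as a hashable value such as a tuple of allele calls.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """HD = n/(n-1) * (1 - sum(P_i^2)), with P_i the frequency of haplotype i."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two samples are required")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


def discrimination_capacity(haplotypes):
    """DC = m/n, with m the number of different haplotypes among n samples."""
    return len(set(haplotypes)) / len(haplotypes)


# Toy data (hypothetical): four samples typed at two loci, three distinct haplotypes.
toy = [("14", "30"), ("14", "30"), ("15", "29"), ("16", "31")]
hd = haplotype_diversity(toy)
dc = discrimination_capacity(toy)
print(f"HD = {hd:.4f}, DC = {dc:.4f}")
```

The same functions apply to a single locus if the allele calls at that locus are passed in place of full haplotypes.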

The allele frequencies of each Y-STR locus are shown in Table S1. Per-locus genetic diversity ranged from 0.3905 (DYS438) to 0.9704 (DYS385a/b). A total of 293 haplotypes were detected, of which 269 were unique (Table S2). The overall haplotype diversity (HD) and discrimination capacity (DC) were 0.9991 and 0.8933, respectively. One tri-allelic genotype and one tetra-allelic genotype were observed at locus DYS385a/b. In addition, two samples showed a null allele at locus DYS448 and one sample showed a null allele at locus DYS570. These three samples were also typed with the Yfiler® Plus kit, and the genotyping results were consistent with those obtained with the PowerPlex® Y23 kit, except for allele 19 at locus DYS570 (Fig. S2). Null alleles at Y-STR loci have been reported previously, and the DYS448 deletion appears to be relatively frequent in Asians [6,7,8,9].

Our data have been submitted to YHRD as Jieyang, China [Han], under accession number YA004506. We chose nine populations comprising 3220 haplotypes as reference populations; their geographical locations are shown in Fig. S3. The pairwise genetic distances (Rst) between populations, with p values, are presented in Table S3. After Bonferroni correction, Rst values showed no significant differences between Jieyang Han and either Guangdong Han or Hunan Han (p > 0.0011, 45 pairs). As shown in Fig. S4, Jieyang Han and most of the southern China-related Han populations were located in the bottom-right quadrant of the MDS plot. In the phylogenetic tree (Fig. S5), three southern China-related groups, Jieyang Han, Hunan Han, and Guangdong Han, clustered on the same branch. Henan Han, Beijing Han, and Jiangsu Han clustered together, and the genetic relationships between Jieyang Han and these populations were very close. Minnan Han, Fujian She, Meizhou Hakka, and Guangxi Zhuang were distant from Jieyang Han. Although Meizhou and Jieyang are geographically close, Jieyang Han was quite different from Meizhou Hakka, which suggests that genetic relationships between populations might not be strongly related to geographical distance. Through gradual integration with local Cantonese, Jieyang Han might have developed close genetic relationships with southern China-related Han populations. However, larger sample sizes, more genetic markers, more population data, and further investigations are needed to confirm the genetic relationships between Teochew people and other populations.
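The Bonferroni threshold quoted above can be reproduced with a short Python sketch. The nominal significance level of 0.05 is an assumption, since the text does not state it; it is, however, consistent with the reported threshold of 0.0011 for 45 pairs. The function name is hypothetical.

```python
from math import comb


def bonferroni_threshold(n_populations, alpha=0.05):
    """Return (number of pairwise comparisons, Bonferroni-corrected threshold).

    alpha = 0.05 is an assumed nominal level, not a value stated in the text.
    """
    n_pairs = comb(n_populations, 2)
    return n_pairs, alpha / n_pairs


# Jieyang Han plus the nine reference populations gives ten populations in total.
n_pairs, threshold = bonferroni_threshold(10)
print(f"{n_pairs} pairs, corrected threshold = {threshold:.4f}")
```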

In summary, genetic polymorphisms of 23 Y-STR loci were investigated in the Jieyang Han population. The MDS plot and phylogenetic tree indicated that Jieyang Han had close genetic relationships with southern China-related Han populations. Our study expands the YHRD and could provide useful information for forensic investigation and population genetics.