The Kashmiris of Pakistan live in various parts of Azad Jammu and Kashmir, while the Punjabis are an Indo-Aryan ethnic group originating from the Punjab region and found in Pakistan and northern parts of India [1, 2]. Blood samples were collected with informed consent from unrelated individuals of known ethnic origin and family history, after approval by the ethical review committee of the University of Health Sciences, Lahore. In total, 94 Punjabis and 101 Kashmiris were included. DNA was isolated from blood using the ReliaPrep™ Blood gDNA Miniprep System (Promega, Madison, USA) and amplified using the AmpFlSTR Yfiler™ PCR Amplification Kit (Yfiler; Applied Biosystems, USA), which targets 17 Y-chromosomal short tandem repeats (Y-STRs) [3]. The amplified products were separated by capillary electrophoresis on an ABI Prism® 3130 xL Avant Genetic Analyzer (Applied Biosystems) using the GeneScan™-500 LIZ™ internal size standard. Allele and haplotype frequencies, diversity estimates, and Rst values were calculated using the software package Arlequin version 3.5 [4] and are provided in Supplementary Tables S1, S2, S3 and S4. Haplotype data have been made accessible via the Y-Chromosome Haplotype Reference Database (YHRD) under accession numbers YA003904 (Kashmiris) and YA003905 (Punjabis).
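As a rough illustration of the locus-level estimates mentioned above, the following Python sketch computes allele frequencies and gene diversity for a single Y-STR locus. It assumes the commonly used unbiased estimator n/(n-1) * (1 - Σp²) and does not reproduce Arlequin's internal implementation. The function names and the toy profiles are hypothetical and are not study data.

```python
from collections import Counter


def allele_frequencies(profiles, locus):
    """Relative frequency of each allele observed at one locus."""
    counts = Counter(profile[locus] for profile in profiles)
    n = len(profiles)
    return {allele: count / n for allele, count in counts.items()}


def gene_diversity(profiles, locus):
    """Unbiased gene diversity: n/(n-1) * (1 - sum of squared allele frequencies)."""
    n = len(profiles)
    freqs = allele_frequencies(profiles, locus)
    return n / (n - 1) * (1 - sum(f * f for f in freqs.values()))


# Invented toy profiles for two Yfiler loci (illustrative only, NOT study data).
toy_profiles = [
    {"DYS391": 10, "DYS392": 11},
    {"DYS391": 10, "DYS392": 11},
    {"DYS391": 11, "DYS392": 11},
    {"DYS391": 10, "DYS392": 11},
]

for locus in ("DYS391", "DYS392"):
    print(locus, allele_frequencies(toy_profiles, locus), gene_diversity(toy_profiles, locus))
```

In this toy set, a locus at which every individual carries the same allele has a gene diversity of zero, whereas a locus with several alleles at similar frequencies approaches one.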

Among the Y-STR markers, DYS635 and DYS458 showed the highest heterozygosities (0.843 and 0.769, respectively), while DYS391 and DYS392 showed the lowest (0.381 and 0.341, respectively) (Table S2). A total of 151 different Y-STR haplotypes were identified among the 195 unrelated males analyzed, of which 133 (68.2%) were each found in a single individual of the total dataset, while the most frequent haplotype was shared among ten individuals (all Kashmiris). The Punjabi population had a higher haplotype diversity (0.996) than the Kashmiri population (0.983); the overall haplotype diversity was 0.994. The Punjabis also had a higher discrimination capacity (87.23%) than the Kashmiris (68.3%); the overall discrimination capacity was 80.51%. The random match probability was 0.015 for the Punjabis and 0.029 for the Kashmiris. Paternal genetic relationships between populations were investigated via multidimensional scaling (MDS) analysis (Fig. S1) of transformed Rst distances (Table S4), using data from six additional Pakistani populations (four sampled in Pakistan and two in England): 71 haplotypes from Khyber Pakhtunkhwa, Pakistan [Yousafzai Pathan] (YHRD Accession No. YA003748), 269 haplotypes from FATA, Pakistan [Pathan] [5], 290 haplotypes from Punjab, Pakistan [Punjabi] [6], 100 haplotypes from Sindh, Pakistan [Sindhi] (YHRD Accession No. YA004152), 132 haplotypes from England-Wales, UK [British Pakistani] [7], and 136 haplotypes from London, UK [Indo-Pakistani] [8]. The Kashmiris and Punjabis clustered together with four other Pakistani groups, while two groups (British Pakistanis from England-Wales and Yousafzai Pathans) clustered more distantly.
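The haplotype-level parameters reported above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: discrimination capacity is taken as the number of distinct haplotypes divided by the sample size, random match probability as the sum of squared haplotype frequencies, and haplotype diversity as n/(n-1) * (1 - Σp²). The function name and the toy haplotypes are hypothetical and are not study data.

```python
from collections import Counter


def forensic_parameters(haplotypes):
    """Summarise haplotype sharing in a sample of Y-STR haplotypes.

    Each haplotype is a hashable tuple of repeat numbers, one per locus.
    """
    n = len(haplotypes)
    counts = Counter(haplotypes)
    # Random match probability: sum of squared haplotype frequencies.
    rmp = sum((c / n) ** 2 for c in counts.values())
    return {
        "n": n,
        "distinct": len(counts),
        "singletons": sum(1 for c in counts.values() if c == 1),
        "discrimination_capacity": len(counts) / n,
        "random_match_probability": rmp,
        # Unbiased haplotype diversity, assumed definition (see lead-in).
        "haplotype_diversity": n / (n - 1) * (1 - rmp),
    }


# Invented three-locus toy haplotypes (illustrative only, NOT study data).
toy_haplotypes = [
    (14, 13, 29),
    (14, 13, 29),
    (15, 12, 30),
    (16, 13, 28),
    (14, 14, 31),
]

print(forensic_parameters(toy_haplotypes))
```

Under these definitions, a population in which many individuals share one haplotype, as with the ten Kashmiris carrying the most frequent haplotype, will show a lower discrimination capacity and a higher random match probability than a population of mostly unique haplotypes.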

Overall, our study demonstrates that the Yfiler kit detects high haplotype diversity in two populations from Pakistan, one of which (Kashmiri) had not been studied previously, making the kit generally suitable for forensic casework in these groups. The recent inclusion of these data in the YHRD allows their widespread use for forensic and other purposes.