Introduction

Human papillomavirus (HPV) is the most common sexually transmitted infection in the world [1]. HPV can cause several cancers, including cancers of the cervix, vulva, vagina, penis, anus, and head and neck [2,3,4,5,6]. More than 400 distinct HPV types have been identified, of which almost 50 can infect the mucosa of the anogenital area [7]. Fourteen of these types (HPV 16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58, 59, 66, and 68) are considered high-risk (HR)-HPV types, which can persist and consequently lead to the development of the above-mentioned cancers, particularly cervical cancer [8,9,10,11]. Worldwide, the ten most common HPV types in cervical cancer are HPV 16, 18, 45, 33, 31, 58, 52, 35, 59, and 56, with frequencies of 55.4%, 16.1%, 4.7%, 4.1%, 3.8%, 3%, 2.8%, 1.9%, 1.2%, and 0.9%, respectively [12]. The ten most frequent HPV types among Iranian women with cervical intraepithelial neoplasia 2–3 (CIN 2–3) were reported to be HPV 16 (48%), 18 (10%), 31 (7%), 45 (5%), 39 (4%), 33 (3%), 58 (3%), 35 (2%), 51 (2%), and 52 (2%) [13].

Two HPV genomes are designated as distinct types when the DNA sequences of their L1 genes differ by more than 10%. Within a given type, differences of 1–10% and 0.5–1% across the complete genome define lineages and sublineages, respectively. HPV 51 has two lineages, A and B, and six sublineages: A1, A2, A3, A4, B1, and B2. HPV 59 also consists of two lineages, A and B, and lineage A is divided into three sublineages: A1, A2, and A3 [14].
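The classification thresholds described above can be written as a small decision rule. The following Python sketch is illustrative only: the function name, the labels it returns, and the handling of values that fall exactly on a boundary are assumptions. It takes the percent differences as given numbers rather than computing them from an alignment.

```python
def classify_variant(l1_diff_pct: float, genome_diff_pct: float) -> str:
    """Classify the relationship between two HPV isolates.

    l1_diff_pct: percent nucleotide difference in the L1 gene.
    genome_diff_pct: percent nucleotide difference across the complete genome.
    Boundary handling (> versus >=) is an assumption made for this sketch.
    """
    if l1_diff_pct > 10.0:
        return "different type"
    if genome_diff_pct >= 1.0:
        return "different lineage"
    if genome_diff_pct >= 0.5:
        return "different sublineage"
    return "same sublineage"


# Hypothetical input values, chosen only to exercise each branch.
for l1, genome in [(12.0, 15.0), (2.0, 3.0), (0.3, 0.7), (0.1, 0.2)]:
    print(l1, genome, classify_variant(l1, genome))
```

The type-level rule uses the L1 gene, while the lineage and sublineage rules use the complete genome, so the sketch keeps the two measurements as separate arguments.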

Studies suggest that the distinct lineages and sublineages of HPVs have different geographical distributions and may be associated with ethnicity. Indeed, geographic associations for the lineages and sublineages of HPV 16, 18, and 58 are well documented [10, 15,16,17,18,19]. However, geographic associations for other HPV types, such as HPV 51 and 59, are not. Further studies worldwide are therefore needed to clarify whether the lineages and sublineages of these HPV types are associated with ethnicity.

While the distribution of HPV types is well recognized [13], few studies have investigated the lineages and sublineages of HR-HPV types in Iran; these cover HPV 16, 18, 31, 45, 39, and 56 [18, 20,21,22,23] and HPV 52/58 (unpublished data). Characterizing the lineages and sublineages of HR-HPV types is helpful for future studies of epidemiology, evolution, pathogenicity, and biology. Accordingly, this study carried out a nucleotide sequence analysis to identify the HPV 51 and HPV 59 lineages and sublineages circulating in Iran.

Materials and methods

Study population

To characterize the lineages and sublineages of HPV 51 and HPV 59, a study was conducted from 2018 to 2020. One hundred and forty-two formalin-fixed paraffin-embedded (FFPE) samples (98 invasive cervical cancer and 44 cervical intraepithelial neoplasia samples, the latter comprising 27 CIN 1 and 17 CIN 2–3) were obtained from Imam Khomeini Hospital in Tehran. In addition, 135 ThinPrep Pap Test samples that had tested positive with the Cobas assay for a pool of 12 high-risk HPV types (31, 33, 35, 39, 45, 51, 52, 56, 58, 59, 66, and 68) were included. Forty-four HPV 51-positive and twenty-two HPV 59-positive ThinPrep Pap Test samples (previously genotyped among 1000 samples by the INNO-LiPA® HPV Genotyping assay) were also collected from referral laboratories in Tehran.

All participants signed an informed consent form approved by the local ethical committee of Tehran University of Medical Sciences (IR.TUMS.SPH.REC.1400.284). Demographic data were obtained from their medical records.

Lineage and sublineage analysis of HPV 51 and HPV 59

DNA was extracted from ThinPrep Pap Test samples using the High Pure Viral Nucleic Acid Kit (Roche Diagnostics GmbH, Roche Applied Science, Mannheim, Germany) according to the manufacturer’s instructions. DNA from FFPE specimens was isolated by phenol–chloroform extraction according to a previously published procedure [24].

All 142 FFPE samples were screened for the HPV genome by nested PCR with the MY09/MY11 and GP5+/GP6+ primer pairs, targeting a 150 bp amplicon of the L1 gene. All HPV-positive samples were subjected to nucleotide sequencing using the BigDye® Terminator v3.1 Cycle Sequencing Kit and a 3130 Genetic Analyzer Automated Sequencer as specified in the Applied Biosystems manuals (Foster City, CA). All sequences were edited with BioEdit software and compared by BLAST (http://www.ncbi.nlm.nih.gov/blast/) to determine HPV genotypes. All 135 ThinPrep Pap Test specimens that were positive for the pool of 12 high-risk HPV types by the Cobas assay were tested for the HPV 51 and HPV 59 genomes by semi-nested PCR with sequence-specific primers for the E6 gene, as described below.

The complete E6 gene of HPV 51 (nucleotides [nt] 97–552), together with partial sequences of the long control region (LCR) and the E7 gene, was amplified by semi-nested PCR with the following primers: GGTGTAACCGAAAAGGGTT (51E6-F1), CCGAAAAGGGTTATGACCGA (51E6-F2), and AGCTGTCAAATTGCTCGTAG (51E6-R), to obtain a 622 bp amplicon. The complete E6 gene of HPV 59 (nt 55–537) was likewise amplified by semi-nested PCR using the following primers: AAGACCGAAAACGGTGCATA (59E6-F), AATTGCTCGTAGCACACAAGG (59E6-R1), and AGTGTTGCTTTTGGTCCATGC (59E6-R2), to amplify a 561 bp fragment. The PCR for both HPV 51 and HPV 59 was performed in a 50 μl reaction mixture containing 2 mM MgCl2, 50 μM of each dNTP, 10 pmol of each primer, 1.5 U of Taq DNA polymerase, and 100 ng of DNA template. The thermal cycling conditions were 35 cycles of 95 °C for 30 s, 55 °C for 50 s, and 72 °C for 50 s for the first round, and 35 cycles of 95 °C for 20 s, 55 °C for 40 s, and 72 °C for 40 s for the second round. In every set of PCR runs, a reaction mixture lacking template DNA was included as a negative control. All PCR products were subjected to bidirectional direct sequencing as described above.
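As a quick arithmetic check on the reaction set-up, the Python sketch below converts the stated primer amount into a final concentration and computes simple descriptive properties of the six primers. The primer sequences and reaction values are taken from the text above; the helper function names are assumptions introduced for illustration and are not part of the published protocol.

```python
# Primer sequences as listed in the Methods (5' to 3').
PRIMERS = {
    "51E6-F1": "GGTGTAACCGAAAAGGGTT",
    "51E6-F2": "CCGAAAAGGGTTATGACCGA",
    "51E6-R": "AGCTGTCAAATTGCTCGTAG",
    "59E6-F": "AAGACCGAAAACGGTGCATA",
    "59E6-R1": "AATTGCTCGTAGCACACAAGG",
    "59E6-R2": "AGTGTTGCTTTTGGTCCATGC",
}

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def reverse_complement(seq: str) -> str:
    """Reverse complement, useful for locating a reverse primer on the plus strand."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def final_concentration_um(amount_pmol: float, volume_ul: float) -> float:
    """pmol per microliter is numerically equal to micromolar."""
    return amount_pmol / volume_ul


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC {gc_percent(seq):.1f}%")

# 10 pmol of each primer in a 50 microliter reaction.
print(f"Primer concentration: {final_concentration_um(10, 50)} micromolar")
```

With the values given in the text, 10 pmol of primer in 50 μl corresponds to a final concentration of 0.2 μM per primer.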

To characterize HPV 51 and HPV 59 lineages and sublineages, the sequences of all studied samples were aligned to the HPV 51 and HPV 59 reference sequences, respectively. The HPV 51 reference sequences were M62877 (A1), KF436870 (A2), KF436873 (A3), KF436875 (A4), KF436883 (B1), and KF436886 (B2); the HPV 59 reference sequences were X77858 (A1), KC470261 (A2), KC470263 (A3), and KC470264 (B) [14]. Phylogenetic trees were built by the maximum likelihood method in MEGA software version 11. The reliability of each tree was assessed by bootstrap analysis with 1000 replicates.
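The figure legends name the Kimura 2-parameter model as the substitution model used in MEGA. To make that model concrete, the Python sketch below computes the pairwise Kimura 2-parameter distance between two aligned sequences and picks the closest reference. This is a simplified stand-in for illustration, not the maximum likelihood tree with bootstrap support used in the study; the function names are assumptions, and the sequences in the usage example are toy strings rather than the actual reference genomes.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two aligned, equal-length sequences.

    Sites with gaps or ambiguous bases are skipped. math.log raises ValueError
    if the sequences are too divergent for the formula to be defined.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other mismatch is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def nearest_reference(sample: str, references: dict) -> str:
    """Name of the reference with the smallest K2P distance to the sample."""
    return min(references, key=lambda name: k2p_distance(sample, references[name]))


# Toy aligned sequences (hypothetical, for illustration only).
toy_refs = {"ref_x": "ACGTACGTAT", "ref_y": "TTGTACGAAT"}
print(nearest_reference("ACGTACGTAC", toy_refs))
```

A nearest-reference rule of this kind gives only a rough assignment; a tree with bootstrap support additionally shows how confidently each sample clusters with a reference.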

Statistical analysis

Statistical analysis was performed using Fisher’s exact test (Epi Info 7, Statistical Analysis System Software), and a P-value of less than 0.05 was considered statistically significant.
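For readers unfamiliar with the test, the Python sketch below computes a two-sided Fisher's exact P-value for a 2×2 table using only the standard library. It is an illustration under stated assumptions, not the Epi Info procedure used in the study: the function name is hypothetical, the counts in the usage example are invented, and comparisons across three histology groups would need an extension to larger tables.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test P-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denominator = comb(total, col1)

    def probability(x: int) -> float:
        # Hypergeometric probability of seeing x in the top-left cell
        # when all row and column totals are fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = probability(a)
    # Sum over all tables that are no more probable than the observed table.
    # The small tolerance guards against floating-point ties.
    return sum(
        p
        for p in (probability(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


# Hypothetical counts: rows are lineages, columns are two diagnostic groups.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"P = {p_value:.3f}; significant at 0.05: {p_value < 0.05}")
```

Fisher's exact test is a common choice when some cell counts are small, as is the case when a few dozen samples are split across lineages and diagnostic groups.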

Results

Among the 142 FFPE specimens, HPV 51 and HPV 59 were detected in 1 (0.7%) and 4 (2.8%) samples, respectively. Among the 135 ThinPrep Pap Test specimens screened for these two types, HPV 51 and HPV 59 were found in 6 (4.4%) and 8 (5.9%) samples, respectively. In total, 51 HPV 51-positive specimens [CIN 2–3 (HSIL)/ICC = 9, CIN 1 (LSIL) = 8, and normal = 34 samples] and 34 HPV 59-positive samples [CIN 2–3 (HSIL)/ICC = 7, CIN 1 (LSIL) = 4, and normal = 23 samples] were included in the lineage and sublineage analysis.
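The detection rates and sample totals above follow from the counts reported in the Study population and Results sections. The short Python sketch below reproduces that arithmetic; the variable and function names are assumptions introduced for illustration.

```python
def percent(count: int, total: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100.0 * count / total, 1)


FFPE_TOTAL = 142
THINPREP_TOTAL = 135

# Positive samples per source: FFPE screen, Cobas-positive ThinPrep screen,
# and ThinPrep samples previously genotyped by INNO-LiPA.
positives = {
    "HPV 51": {"ffpe": 1, "thinprep_cobas": 6, "thinprep_innolipa": 44},
    "HPV 59": {"ffpe": 4, "thinprep_cobas": 8, "thinprep_innolipa": 22},
}

for hpv_type, counts in positives.items():
    print(
        hpv_type,
        f"FFPE {percent(counts['ffpe'], FFPE_TOTAL)}%,",
        f"ThinPrep {percent(counts['thinprep_cobas'], THINPREP_TOTAL)}%,",
        f"total analysed {sum(counts.values())}",
    )
```

The totals of 51 HPV 51-positive and 34 HPV 59-positive samples are the sums of the three sources for each type.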

Lineage analysis of HPV 51 showed that both the A and B lineages were present in our samples. The A lineage was detected in 41 of 51 HPV 51-positive samples (80.4%), and the remaining samples carried the B lineage (19.6%). Among samples with the A lineage, all four sublineages were detected: A1 in 53.7%, A2 in 12.2%, A3 in 2.4%, and A4 in 31.7% (Fig. 1 and Table 1). Among samples with the B lineage, sublineage B2 was dominant (70%), and sublineage B1 was detected in the remaining 30%. Overall, the frequencies of the sublineages were 43.2% (A1), 9.8% (A2), 1.9% (A3), 25.5% (A4), 5.9% (B1), and 13.7% (B2) (Table 1). These results indicate that sublineages A1 and A4 are dominant in Iran.

Fig. 1

Phylogenetic analysis of HPV 51 (full sequence of the E6 gene and partial sequences of the E7 gene and LCR) was conducted in MEGA11 by the Maximum Likelihood method based on the Kimura 2-parameter model. The reference sequences, indicated by black circles, were M62877 (A1), KF436870 (A2), KF436873 (A3), KF436875 (A4), KF436883 (B1), and KF436886 (B2)

Table 1 HPV 51 sublineages identified in normal, CIN 1 (LSIL), and CIN 2–3 (HSIL)/ICC samples of Iranian women

Sequence analysis showed nucleotide substitutions relative to the prototype sequence (M62877) in 29 samples. These substitutions occurred at 10 positions: C71A/G, A72T, and T74A in the LCR; C102T, T150C, T240C, A311G, and C395T in the E6 gene; and A584C and G656A in the E7 gene (Table 1). Of the seven nucleotide changes in the E6/E7 genes, four (A311G and C395T in E6; A584C and G656A in E7) were non-synonymous and led to the amino acid changes K72R and S100L in the E6 protein and K9Q and E33K in the E7 protein, respectively. S100L in the E6 protein was the most prevalent amino acid change, found in 20 of 51 (39.2%) samples, followed by E33K (15.7%), K72R (5.9%), and K9Q (5.9%).
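The substitution labels used here (for example C395T) combine the reference base, the genome coordinate, and the variant base, and the affected codon follows from the start coordinate of the open reading frame. The Python sketch below illustrates both steps. The function names are assumptions, the sequences in the first example are toy strings, and the only coordinates used are the E6 start positions stated in the Methods (nt 97 for HPV 51 and nt 55 for HPV 59); the E7 start is not given in the text, so E7 is not shown.

```python
def call_substitutions(reference: str, sample: str, first_position: int) -> list:
    """Return labels such as 'C395T' for each mismatch between two aligned sequences.

    first_position is the genome coordinate of the first aligned base.
    Gaps and ambiguous bases are ignored.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    calls = []
    for offset, (ref_base, alt_base) in enumerate(zip(reference.upper(), sample.upper())):
        if ref_base in "ACGT" and alt_base in "ACGT" and ref_base != alt_base:
            calls.append(f"{ref_base}{first_position + offset}{alt_base}")
    return calls


def codon_of(position: int, orf_start: int) -> tuple:
    """Map a genome coordinate to (codon number, position within codon), both 1-based."""
    offset = position - orf_start
    if offset < 0:
        raise ValueError("position lies upstream of the open reading frame")
    return offset // 3 + 1, offset % 3 + 1


# Toy example: one mismatch at the third aligned base.
print(call_substitutions("ACGT", "ACAT", first_position=100))

# HPV 51 E6 starts at nt 97 according to the Methods.
print(codon_of(311, 97))
print(codon_of(395, 97))
```

With an E6 start at nt 97, positions 311 and 395 fall in codons 72 and 100, which matches the reported K72R and S100L changes. Applying the same mapping with the HPV 59 E6 start at nt 55 places position 403 in codon 117, consistent with the K117E change reported for that type.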

Among the HPV 51 variants found in this study, seven nucleotide substitution patterns were detected: no change (43.2%), C71A/T74A (3.9%), C71G/T74A (5.9%), C71G/A72T/G656A (1.9%), C71G/C395T (25.5%), C71A/C102T/T150C/T240C/A311G/A584C (5.9%), and C71G/C102T/T240C/C395T/G656A (13.7%) (Table 1).

Stratification of the sequences by histology/cytology status showed that, although the A lineage was more prevalent in the CIN 2–3 (HSIL)/ICC group, the difference was not statistically significant (P = 0.74). Likewise, no statistically significant difference was found between the two age groups (P = 0.99) (Table 2).

Table 2 The frequency of HPV 51 lineages stratified by histology/cytology status or age in cervical samples of Iranian women

Lineage analysis of HPV 59 showed that 32.4% of samples (11 of 34) belonged to the A lineage and 67.6% (23 of 34) belonged to the B lineage. All samples with the A lineage belonged to the A1 sublineage (Table 3, Fig. 2).

Table 3 HPV 59 sublineages identified in normal, CIN 1 (LSIL), and CIN 2–3 (HSIL)/ICC samples of Iranian women

Fig. 2

Phylogenetic analysis of the HPV 59 E6 gene was conducted in MEGA11 using the Maximum Likelihood method based on the Kimura 2-parameter model. The reference sequences, indicated by black circles, were X77858 (A1), KC470261 (A2), KC470263 (A3), and KC470264 (B)

Eight nucleotide substitutions relative to the prototype sequence (X77858) were found, at positions T102C, G171T, T213C, G252C, C306T, A348G, T402C, and A403G. Among these, only the A-to-G change at position 403 led to an amino acid substitution (K117E in the E6 protein); the other changes were silent. As shown in Table 3, seven distinct nucleotide substitution patterns were found: no change (32.4%), T102C/C306T/T402C (27%), T102C/G171T/C306T/T402C (27%), T102C/G252C/C306T/T402C (2.9%), T102C/T213C/C306T/T402C (2.9%), T102C/C306T/A348G/T402C (2.9%), and T102C/C306T/T402C/A403G (5.9%).

Stratification by histology/cytology status showed that the A lineage was more common in the normal group, while the B lineage was more prevalent in the CIN 1 (LSIL) group; however, the difference was not statistically significant (P = 0.90). No statistically significant difference was found between the two age groups either (P = 0.78) (Table 4).

Discussion

This study found that 80.4% and 19.6% of HPV 51-positive samples belonged to the A and B lineages, respectively. All six sublineages (A1, A2, A3, A4, B1, and B2) were detected in our samples, but sublineages A1 (43.2%) and A4 (25.5%) were more prevalent than the others (Fig. 1 and Table 1). This finding is broadly in agreement with several previous studies. A study from North America showed that the A lineage was dominant and the B lineage was detected in few samples; A1 was the most common (73.4%), followed by A2 (20.1%), A3 (4.9%), and B (1.6%) [25]. A study from Costa Rica showed that both lineage A (68.9%) and lineage B (31.1%) circulate in that country, with the following sublineage frequencies: A1 (58%), A2 (8.7%), A4 (2.2%), and B1 (31.1%) [26]. A study from China detected only the A lineage; sublineage A4 was dominant (54.5%), and sublineages A1 and A2 were found in 9% and 36.5% of specimens, respectively [27]. In another study from Southwest China, sublineages A1, A2, and A4 were found in 15%, 50%, and 35% of samples, respectively [28]. Conversely, the B lineage, and sublineage B2 in particular, was predominant in African countries [26]. Taken together, these studies indicate that the most common sublineages are A1 in North America, A1 and B1 in Costa Rica, A2 and A4 in China, A1 and A4 in Iran, and B2 in Africa (Fig. 3). This diverse geographical distribution of HPV 51 lineages and sublineages may reflect their coevolution with host ethnicity.

Fig. 3

Comparison of the prevalence of lineages/sublineages of papillomavirus type 51 (A1-A4 and B1-B2) in Iran (this study) with available data from other countries in the world

In the present study, the distribution of lineages did not differ significantly among the three studied groups, and the A lineage was prevalent in all three (Table 2). Interestingly, in Costa Rica, where lineages A and B co-circulate, the B lineage was reported to be possibly more likely than the A lineage to progress to cervical intraepithelial neoplasia grade 3 (CIN 3) [29].

Our results indicated that amino acid changes occurred in 45.1% of samples.

The change of serine to leucine at position 100 (S100L) of the E6 protein was the most common substitution. In agreement with our results, S100L has previously been reported as the most prevalent substitution. This substitution is characteristic of both sublineages A4 and B2, which are most widely distributed in China and Africa, respectively [28].

Regarding the lineage analysis of HPV 59, our results showed that 32.4% and 67.6% of samples belonged to lineages A and B, respectively. All samples with the A lineage belonged to the A1 sublineage (Table 3, Fig. 2). In North America, lineages A and B were found in 12% and 88% of samples, respectively [25]. In China, lineages A and B were reported in 53.8% and 46.2% of samples, respectively, with the following sublineage distribution: A1 in 30.8%, A3 in 23%, and B in 46.2% of specimens [27]. The B lineage was common in Africa, and the A lineage was predominant in Costa Rica [30] (Fig. 4).

Fig. 4

Comparison of the prevalence of lineages of papillomavirus type 59 (A and B) in Iran (this study) with available data from other countries in the world

No association was observed between HPV 51 or HPV 59 variants and pathological stage; the small sample size most likely did not allow major differences to be detected (Table 4).

Table 4 The frequency of HPV 59 lineages stratified by histology/cytology status or age in cervical samples of Iranian women

Conclusion

Our results showed that the A lineage of HPV 51, and in particular sublineages A1 and A4, is the most prevalent in Iran. For HPV 59, both lineages A and B were detected in our samples. However, further studies with larger sample sizes are needed to estimate the pathogenicity risk of HPV 51 and HPV 59 variants in Iran. The integration status of these types in the host genome could be examined in future work. We also recommend that future studies characterize HLA molecules in relation to the different HPV 51 and HPV 59 variants.