Human papillomavirus type 31 (HPV-31), a member of the HPV-16-related alpha-9 species, accounts for approximately 4 % of invasive cervical cancers globally [1, 2]. Expression of early papillomavirus genes, including the E6 and E7 oncogenes, is controlled by the noncoding long control region (LCR), which contains binding sites for the papillomavirus regulatory proteins (E1, E2) as well as for numerous cellular transcription factors. E2, the major viral regulatory protein, modulates the expression of the E6 and E7 oncogenes by binding to consensus target sequences in the LCR. The LCR of high-risk papillomaviruses contains four highly conserved E2 binding sites (E2BSs) [3–5]. Besides the full-length E2, additional truncated forms of the E2 protein (E2C, E8^E2, E8^E2C) have been identified for both high-risk and low-risk HPV types. These proteins can also recognize E2 binding sites and act as inhibitors of viral replication and transcription [6, 7].

DNA methylation is recognized as an important epigenetic mechanism modulating papillomavirus gene expression, and the epigenetic regulation of oncogenes is a notable factor in HPV-associated carcinogenesis. The methylation status of the virus genome varies during the viral life cycle and disease progression [8–10]. The DNA methylation pattern of HPV-16, the most prevalent carcinogenic type, has been studied extensively, but those of other high-risk types, including HPV-31, have received less attention [9–11]. Since HPV-31 also frequently causes cytological abnormalities [12], we investigated the methylation pattern of the HPV-31 LCR in clinical samples and determined the methylation level of CpG sites using next-generation sequencing after bisulfite modification.

HPV detection and genotyping were performed on a routine diagnostic basis in exfoliated cervical cells of patients presenting with cytological abnormalities. The 22 HPV-31-positive cases were classified into two groups: patients with lesions progressing to cervical intraepithelial neoplasia grade 2 or 3 (CIN2+) within 1 year, and patients with cytological abnormalities that did not progress (≤CIN1). Quantitative DNA methylation analysis by next-generation sequencing focused on CpG sites in the whole HPV-31 LCR (from nt 7066 to 97) and the 5′-end of the E6 gene. The methylation status of cytosines within the HPV-31 LCR was determined after bisulfite modification of the target DNA. The CpG sites in the HPV-31 LCR were covered by four amplification products (7066–7511, 7215–7806, 7514–81, and 7670–345, based on the reference sequence, GenBank accession number J04353). Long-read sequence information was obtained using a Roche GS Junior System. The proportion of thymine at each cytosine residue indicated the conversion of unmethylated cytosine during bisulfite modification.
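The read-out described above can be illustrated with a minimal sketch. It assumes that, at a given CpG, reads showing cytosine represent methylated templates and reads showing thymine represent unmethylated, converted templates. The function name and the read counts are hypothetical and are not taken from the study.

```python
def methylation_percent(c_reads: int, t_reads: int) -> float:
    """Percent methylation at one CpG from bisulfite sequencing read counts.

    Assumption: bisulfite converts unmethylated C to U (read as T), while
    methylated C is protected and is still read as C.
    """
    total = c_reads + t_reads
    if total == 0:
        raise ValueError("no informative reads at this position")
    return 100.0 * c_reads / total


# Hypothetical counts for a single CpG: 12 reads with C, 88 reads with T.
print(methylation_percent(12, 88))
```

Under this convention, the thymine proportion at a non-CpG cytosine serves as a measure of conversion efficiency, whereas at a CpG it reflects the unmethylated fraction.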

Single nucleotide changes and deletions characteristic of HPV-31 variants from different lineages were found in the sequenced LCR fragments, and in 16 specimens the results of the next-generation sequencing of bisulfite-modified DNA were concordant with previously reported results obtained by Sanger sequencing of non-modified DNA [13]. Because bisulfite modification does not affect 30 of the 39 lineage-specific SNPs, it was still possible to identify the HPV-31 lineages. Altogether, 14 of the HPV-31 isolates were lineage C, and 8 were lineage B. The CIN2+ group consisted of two lineage B infections and five lineage C infections.

All 12 potential methylation sites within the HPV-31 LCR (CpG positions: 7170, 7391, 7414, 7419, 7479, 7485, 7537, 7870, 7876, 40, 54, 60) were examined. Sixteen samples displayed hypomethylated HPV-31 LCR sequences. CpG methylation was observed at one position in the 5′ LCR, four positions in the enhancer, and two positions in the promoter region (Fig. 1). Four of these methylated CpGs (7479, 7485, 7876, and 40) mapped to three of the E2 binding sites (E2BS1-3) of the LCR, and all six cases (sample IDs: 8249, 9417, 9538, 5357, 7146, and 7208) with CpG methylation in the LCR showed methylation (positions 7479 or 7485, or both) at the promoter distal E2 binding site, E2BS1 (Table 1).

Fig. 1

Schematic representation and methylation of the HPV-31 LCR and the 5′-end of the E6 gene in clinical samples. a Structure of the HPV-31 LCR and regulatory factor binding sites. AE auxiliary enhancer, KE keratinocyte enhancer, SID sample identifier. Selected cellular factor binding sites are indicated as white boxes, and binding sites of the regulatory viral proteins E1 and E2 are shown by gray and black boxes, respectively. The single arrow shows the transcription start site of the E6 gene (p97). b CpG positions in the HPV-31 LCR are indicated as vertical bars. Numbers in circles show the positions of individual CpG sites. Methylated CpGs are indicated with filled circles of different colors. For each specimen, the chain of circles indicates the length of sequence data.

Table 1 CpG methylation frequencies (%) exceeding 10 % in the HPV-31 LCR and in the 5′-end of the E6 gene

Of the promoter proximal E2 binding sites, only partial CpG methylation was detected at two sites (E2BS2 and E2BS3) in three of the six cases (sample IDs: 9538, 7146, and 7208; Table 1). However, in each of the three cases, the partial CpG methylation of the promoter E2 binding sites was associated with marked methylation in the promoter distal E2 binding site (E2BS1) (Fig. 1). Two of the lineage C CIN3/CIS (carcinoma in situ) cases (sample IDs: 7146 and 7208) had multiple methylated CpGs, particularly in the promoter distal E2 binding site, while the other two CIN3/CIS cases (sample IDs: 7404 and 7527) were hypomethylated (Fig. 1). On the other hand, CpG methylation of the promoter distal E2 binding site was found in four ≤CIN1 cases (sample IDs: 8249, 9417, 9538, and 5357; Table 1).
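The distribution of E2BS1 methylation between the two patient groups can be laid out as a 2×2 table using only counts stated in the text: 2 of the 7 CIN2+ cases and 4 of the 15 ≤CIN1 cases showed methylation at E2BS1. The following sketch applies a two-sided Fisher's exact test to that table. This is an illustrative calculation, not an analysis reported in the study, and the function name is hypothetical.

```python
from fractions import Fraction
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> Fraction:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables (with the same
    margins) that are no more likely than the observed one. Exact
    fractions are used to avoid floating-point ties.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, row1)

    def prob(k: int) -> Fraction:
        return Fraction(comb(col1, k) * comb(n - col1, row1 - k), denom)

    k_min = max(0, row1 - (n - col1))
    k_max = min(row1, col1)
    p_obs = prob(a)
    return sum(
        (prob(k) for k in range(k_min, k_max + 1) if prob(k) <= p_obs),
        Fraction(0),
    )


# Rows: CIN2+ (n = 7) and <=CIN1 (n = 15); columns: E2BS1 methylated / not.
p_value = fisher_exact_two_sided(2, 5, 4, 11)
print(float(p_value))
```

With these counts the observed table is the single most probable one under the null hypothesis, so the test gives no indication that E2BS1 methylation differs between the groups. With only 22 cases, such a test has little power.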

In addition to the LCR analysis, the study setting allowed the analysis of six additional CpG sites (104, 135, 157, 176, 275, and 287) in the 5′-end of the E6 gene. This region was also hypomethylated in the majority of samples. Partial methylation was seen in two cases: a ≤CIN1 group specimen (sample ID 7110) with 12 % methylation of CpG-135 and CpG-157, and a CIN2+ group specimen (sample ID 7404) with 14 % methylation of CpG-157.
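The E6 sites became accessible because the fourth amplification product (7670–345) extends past the LCR into the E6 gene. The sketch below checks which of the four amplicons covers each CpG position listed in the text. It assumes that all coordinates refer to the circular reference genome, so an amplicon whose start exceeds its end wraps through the origin, and it ignores primer-binding regions. The helper names are hypothetical.

```python
# Amplicon coordinates and CpG positions as given in the text.
AMPLICONS = [(7066, 7511), (7215, 7806), (7514, 81), (7670, 345)]
LCR_CPGS = [7170, 7391, 7414, 7419, 7479, 7485, 7537, 7870, 7876, 40, 54, 60]
E6_CPGS = [104, 135, 157, 176, 275, 287]


def in_amplicon(pos: int, start: int, end: int) -> bool:
    """True if pos lies within [start, end] on a circular genome."""
    if start <= end:
        return start <= pos <= end
    # Wrapping amplicon: from start to the genome end, then from 1 to end.
    return pos >= start or pos <= end


def covering(pos: int) -> list[tuple[int, int]]:
    """All amplicons that contain the given position."""
    return [amp for amp in AMPLICONS if in_amplicon(pos, *amp)]


uncovered = [p for p in LCR_CPGS + E6_CPGS if not covering(p)]
for pos in LCR_CPGS + E6_CPGS:
    print(pos, covering(pos))
print("uncovered:", uncovered)
```

Under these assumptions every listed CpG falls in at least one amplicon, and the six E6 positions are reached only by the 7670–345 product.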

Previous studies of CpG methylation in HPV-31-related disease have reported only the mean methylation of each CpG group tested, and covered only five [14] or eight [15] of the 12 CpG sites in the HPV-31 LCR. Here we present the methylation data from individual patients, covering all the CpG groups, which allows a more nuanced analysis. In addition, our analysis of cases with CpG methylation in the LCR reveals which of the CpG sites were exposed to methylation at early stages of host cell transformation.

The CpG methylation of the most prevalent high-risk type, HPV-16, has been the focus of numerous studies [9, 10, 16], and HPV-31 and HPV-35 are the types most closely related to HPV-16 [17]. In HPV-16 infections leading to transformation, the promoter is hypomethylated, while the distal parts of the LCR, including the promoter distal E2BS, are methylated [9]. In CIN2+ and invasive cancer, CpG methylation in the HPV-16 LCR appears to be an optional alteration, occurring in some infections and missing in others [9, 18–20]. Binding of E2 to the high-affinity binding site E2BS1 (promoter distal E2BS) in the enhancer region activates the early promoter P97, resulting in enhanced production of early proteins including E2 and E6/E7. At increased levels, E2 can also bind to the low-affinity promoter proximal binding sites E2BS2-4 and inhibit early promoter activity, thus repressing the transcription of the E6 and E7 oncogenes [3, 4]. Furthermore, in vitro experiments have indicated that methylation of CpGs within E2BS1 significantly increases HPV-16 early promoter activity [9].

In the HPV-31 LCR, there are 12 CpG sites, seven of them in E2 binding sites (E2BS) [13]. The majority of our samples had hypomethylated CpGs both in the 5′ LCR/enhancer and in the promoter regions without significant variation between patient groups, and these results are consistent with the observations of previous studies investigating HPV-31 [14, 15]. On the other hand, we observed methylation at seven CpGs in the HPV-31 LCR in six clinical specimens. Four of these methylated CpGs overlap E2 binding sites E2BS1-3, and the promoter distal E2BS1 was methylated in all six samples. Methylation frequencies were low in the promoter proximal E2 binding sites in the promoter region and higher in the distal part of the LCR, involving marked methylation of E2BS1. This suggests a role for CpG methylation of the promoter distal E2BS1 in the development of cervical squamous abnormalities associated with HPV-31 infections. This observation is in good agreement with previous findings reported by Vinokurova et al. [9] and suggests that HPV-31 uses a regulatory mechanism similar to that used by HPV-16 and possibly by the entire HPV-16 group (alpha-9 species). In conclusion, this study has identified the promoter distal E2 binding site (E2BS1) as a target for CpG methylation within the HPV-31 LCR, which, by analogy with HPV-16, suggests a mechanism for activation of papillomavirus transcription.