1.1 Introduction

Papillomaviruses are defined by their (1) non-enveloped capsids, (2) circular double-stranded DNA genomes with sizes close to 8000 bp and highly conserved gene organization, (3) host species specificity, (4) tropism for epithelial cells, and (5) transforming rather than lytic effects on the host cells. They cause neoplastic growth of the infected epithelium or can persist in asymptomatic infections.

Papillomavirus genomes have a noncoding region (long control region, LCR) of about 800 bp, which harbors the replication origin, a transcriptional enhancer, and a promoter (Bernard 2013). About half of the genome downstream of the promoter contains the early genes E6, E7, E1, E2, E4, and E5, with the remainder further downstream encoding the late proteins L2 and L1 (Fig. 1.1). For the purpose of this chapter, the following brief summary of the function of these proteins may suffice: E6 and E7 trigger the principal transforming mechanisms, such as interference with p53 and RB cell cycle control (Roman and Munger 2013; vande Pol and Klingelhutz 2013). E1 binds the replication origin and functions as a helicase (Bergvall et al. 2013). E2 functions as an activating and repressing transcription factor and cooperates with E1 in identification of the replication origin (McBride 2013). L1 and L2 are the major and the minor capsid proteins, respectively (Buck et al. 2013; Wang and Roden 2013).

Fig. 1.1 Genome organization of human papillomavirus 16

The papillomavirus life cycle and papillomavirus pathogenesis can be summarized as follows: Papillomaviruses most often infect squamous, i.e., multilayered and differentiating, epithelia. In order to establish a stable infection, a papillomavirus particle has to infect the basal layer of such an epithelium, where the circular viral episome persists and replicates. Asymmetric cell divisions of the basal cells lead to suprabasal cells, beginning with the spinous layer. Suprabasal cells normally lack mitotic activity and DNA replication properties. Papillomavirus genomes that are sorted into such cells express E6 and E7 oncoproteins, which target the cellular Rb and p53 proteins, and thereby reestablish an environment of continuing DNA replication and mitoses. The resulting expansion of the suprabasal cell population leads to neoplastic lesions, referred to in the case of skin as “warts.” In cell layers close to the epithelial surface, the virus expresses the capsid proteins, which encapsidate the viral DNA into viral particles that are released upon disintegration of terminally differentiated epithelial surface cells. This life cycle also applies during nonmalignant infections to those papillomaviruses that are found in cancer, while it becomes distorted during carcinogenesis. Not all details of carcinogenesis are understood yet, but early molecular events frequently involve recombination between papillomavirus and host cell DNA in a genomic arrangement that leads to stimulation of papillomavirus oncogene transcription. For progression to a malignant phenotype, the affected cell has to undergo numerous additional mutations and epigenetic changes of cellular genes (see Mine et al. 2013, and references therein).

For taxonomic purposes, papillomaviruses are referred to as “types,” and their names are abbreviated with the letters PV, preceded by one or two letters that define the host, and followed by a number indicating the historic sequence of isolation (Bernard et al. 2010). Among more than 300 papillomavirus types described so far, only the cottontail rabbit papillomavirus (CRPV1) and those human papillomaviruses (HPVs) that are most prevalent in carcinomas of the cervix uteri (HPV16, 18, 31, 33, 35, 45, 52, 58) have been addressed by DNA methylation research. No methylation studies have been done with the well-developed cell culture system for bovine papillomavirus-1 (BPV1) or in situ with those HPVs that cause mostly benign lesions such as HPV2 (common warts) or HPV6 (genital warts). These particular PVs are stably maintained as episomes, while the aforementioned PVs can recombine with cellular DNA, which may influence viral DNA methylation.
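The naming convention described above (host letters, then PV, then an isolation number) can be illustrated with a short Python sketch. The function name `parse_pv_type` and the regular expression are illustrative assumptions, not part of any official nomenclature tool, and the pattern covers only the simple form stated in the text.

```python
import re

# Pattern assumed from the text: one or two host letters, "PV", then a number.
PV_NAME = re.compile(r"^([A-Za-z]{1,2})PV(\d+)$")

def parse_pv_type(name):
    """Split a papillomavirus type name into (host letters, type number).

    Returns None if the name does not follow the simple convention.
    """
    match = PV_NAME.match(name)
    if match is None:
        return None
    return match.group(1), int(match.group(2))

for name in ["HPV16", "BPV1", "CRPV1"]:
    print(name, parse_pv_type(name))
```

For the types named in this chapter, the sketch separates the host prefix (H, B, CR) from the type number; names that deviate from the simple pattern are rejected rather than guessed at.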

1.2 History of Papillomavirus Methylation Research

The first records of PV DNA methylation were garnered with CRPV1 (at that time also called Shope papillomavirus), which was shown to have methylated, chromosomally integrated, multicopy viral DNA in rabbit skin tumors (Wettstein and Stevens 1983; Sugawara et al. 1983). These observations preceded the modern understanding of the regulatory importance of epigenetic alterations. Unfortunately, the CRPV/rabbit system has never been reinvestigated since then. Subsequently, the potential for transcriptional effects of DNA methylation on HPV16 and HPV18 DNA was established in vitro and in cell culture experiments (Thain et al. 1996; Rösl et al. 1993), but these observations were not extended to a search for methylated HPV DNA in situ. Several years later it turned out that HPV DNA methylation is actually a widespread phenomenon: it was observed in a cell line with episomal HPV16 DNA (Kim et al. 2003), in cell lines with integrated HPV16 and HPV18 DNA, and in carcinomas and their precursor lesions (Badal et al. 2003, 2004; Kalantari et al. 2004). While these studies opened a rich field of investigation, their analytical power was initially limited by the use of methylation-sensitive restriction enzymes and by a focus on small genomic regions, including E2 binding sites. In the last 10 years, DNA methylation studies by bisulfite sequencing have targeted larger parts of the genomes or whole genomes of several HPV types in the context of the viral life cycle and carcinogenesis, and these findings will be reviewed in this chapter.

The following questions emerged as the most challenging research objectives:

  1. Does DNA methylation affect HPV biology during the normal viral life cycle and are HPVs unique DNA methylation targets, e.g., as a form of cellular defense against foreign DNA?

  2. Are there specific regulatory effects of DNA methylation via CpG dinucleotides in E2 binding sites?

  3. Do HPVs affect the cellular epigenome?

  4. Does DNA methylation differentially affect HPV genomes during progression of asymptomatic infections through precursor lesions to malignant lesions?

  5. Are HPV epigenomes or cellular epigenomic properties in HPV infected cells useful biomarkers in the diagnosis of cancer precursor lesions?

1.3 Methylation of HPV DNA During the Normal Life Cycle

In order to understand a potential role of HPV DNA methylation during the normal life cycle, it would be desirable to investigate the DNA first in the viral capsid, then immediately following infection of the basal layer, and further during epithelial differentiation. Unfortunately, HPV research here and elsewhere has always been hampered by the absence of animal models and by difficulties in establishing or reproducing cell culture systems. As a consequence, there are presently only two sources of information about the epigenetics of HPVs during the viral life cycle, namely, the epigenetic properties of HPV DNA from patients likely to harbor only episomal DNA and a stable cell line, W12E, that maintains HPV16 DNA episomally.

HPV16, HPV18 and several related high-risk HPV types infect squamous mucosal cells of the female genital tract subclinically, and these infections can progress through cervical cancer precursor lesions (cervical intraepithelial neoplasia I and III, CIN I and CIN III) to invasive cervical cancer. In subclinical infections and CIN I lesions, HPV genomes exist as episomes, while an increasing portion of them recombines during progression. Consequently, it can be assumed that clinical samples obtained from asymptomatic individuals and CIN I patients contain HPVs during the normal viral life cycle. Studies from numerous labs agree that such clinical samples contain “sporadically” methylated HPV genomes, “sporadically” referring to average methylation frequencies per CpG in the range of 5–10%, and a lack of specificity for the CpG target (Kalantari et al. 2004; Turan et al. 2006; Brandsma et al. 2009; Sun et al. 2011; Mirabello et al. 2013). All of these studies addressed CpGs in the LCR and the L1 gene, but some extended the findings throughout the genome. These data were derived from methylation analyses of short PCR amplicons. This is a nontrivial limitation, as the sequencing of multiple cloned amplicons from each patient sample found substantial heterogeneity of methylation between HPV genomes from the same sample (Kalantari et al. 2004).
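The distinction between a low average methylation frequency per CpG and heterogeneity between individual genomes can be made concrete with a small Python sketch. The data are invented for illustration and do not come from any cited study; each hypothetical cloned amplicon is encoded as a string with one character per CpG position (`M` for methylated, `U` for unmethylated), and the function names are assumptions.

```python
def per_cpg_frequency(clones):
    """Fraction of clones methylated at each CpG position."""
    n_sites = len(clones[0])
    if any(len(clone) != n_sites for clone in clones):
        raise ValueError("all clones must cover the same CpG positions")
    return [
        sum(clone[i] == "M" for clone in clones) / len(clones)
        for i in range(n_sites)
    ]

def per_clone_fraction(clones):
    """Fraction of CpG positions methylated within each clone."""
    return [clone.count("M") / len(clone) for clone in clones]

# Hypothetical sample: ten cloned amplicons covering five CpGs.
clones = ["UUUUU"] * 8 + ["MUMUU", "UMUUM"]

site_freq = per_cpg_frequency(clones)
mean_freq = sum(site_freq) / len(site_freq)
clone_frac = per_clone_fraction(clones)

print("per-CpG frequency:", site_freq)
print("mean per-CpG frequency:", round(mean_freq, 3))
print("per-clone fraction:", clone_frac)
```

In this toy sample the mean frequency per CpG falls within the "sporadic" range, yet the methylation is carried by only two of the ten clones, which is the kind of within-sample heterogeneity that averaging over a PCR amplicon would hide.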

This investigation of clinical samples was complemented by cell culture studies. Among the few cell culture models in papillomavirus research are W12E cells, cloned from a CIN lesion of an HPV16-infected patient. The cells grow on a fibroblast feeder layer and morphologically resemble the basal layer of epithelia. Some differentiation can be observed in confluent cultures, and these differentiated cells can be separated from the undifferentiated cells. Lambert and colleagues (Kim et al. 2003) observed a consistent, but only “sporadic” methylation of the HPV16 LCR in undifferentiated cells, similar to patterns observed in situ in cells harvested from asymptomatic or CIN I patients. Most of this methylation was lost upon differentiation of the W12E cells. It is known from studies addressing different aspects of HPV biology that transcription of HPVs becomes activated upon differentiation, and so it is tempting to hypothesize that the observed epigenetic change is the switch between two different transcription states. It should be noted that methylation was only rarely observed at the E2 binding sites overlapping with the E6 promoter, where it would activate rather than repress this promoter (see below).

Another study of W12E cells confirmed the methylation of the HPV16 LCR in undifferentiated cells, as well as the ensuing demethylation upon differentiation (Kalantari et al. 2008a). This study also addressed five clonal derivatives of W12E, where all HPV16 genomes had recombined with the cellular DNA. Three of these clones with few HPV16 copies had nearly no methylation of LCR sequences but some methylation of the L1 gene, which is adjacent to the LCR but not transcriptionally affected by its properties. Two clones with numerous HPV16 copies showed strong methylation of the LCR. In contrast to the W12E cells with episomal DNA, differentiation of these five clones with chromosomally integrated viral DNA did not alter HPV16 DNA methylation, either in the LCR or in the L1 gene.

1.3.1 In Summary

Studies of clinical samples as well as the W12E line agree that episomal HPV16 DNA is targeted by DNA methylation. DNA methylation is sporadic, i.e., low, and polymorphic both within an individual sample and between comparable samples. Differentiated W12E cells contain completely unmethylated HPV16 LCR segments, and such molecules exist in most clinical samples with episomal DNA. It is therefore likely, but not mechanistically understood, that HPV16 episomes are methylation targets in undifferentiated epithelial cells. This should negatively affect transcriptional activity. Demethylation may release this repression in suprabasal cells and lead to increased transcription, as observed in situ. No evidence suggests a selective recognition of the viral DNA as part of a cellular defense mechanism.

1.4 Regulatory Effects of DNA Methylation via CpG Dinucleotides in E2 Binding Sites

The papillomavirus E2 gene encodes proteins that have the ability to bind the palindromic DNA sequence 5′-ACCGNNNNCGGT-3′, which occurs four times in the LCR of HPV16 and related HPV types. This sequence has two CpG methylation targets, and in vitro studies have shown that E2 proteins cannot bind the methylated target sequences (Thain et al. 1996). As expected, transfection experiments with unmethylated and methylated E2 site reporter genes and E2 factor expression vectors confirmed that methylation dramatically interferes with transcriptional transactivation (Kim et al. 2003).
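The consensus sequence given above lends itself to a simple pattern search. The following Python sketch locates matches to 5′-ACCGNNNNCGGT-3′ in a DNA string and lists the CpG dinucleotides within each match. The input is a hypothetical toy sequence, not a real HPV16 LCR fragment, and the function names are illustrative assumptions. Because the consensus is its own reverse complement, scanning one strand is sufficient in this sketch.

```python
import re

# Consensus E2 binding site from the text: ACCG, four arbitrary bases, CGGT.
# The lookahead lets overlapping matches be reported as well.
E2_SITE = re.compile(r"(?=(ACCG[ACGT]{4}CGGT))")

def find_e2_sites(sequence):
    """Return (0-based start, matched site) for each consensus match."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in E2_SITE.finditer(seq)]

def cpg_positions(site):
    """Return 0-based offsets of CpG dinucleotides within a site."""
    return [i for i in range(len(site) - 1) if site[i:i + 2] == "CG"]

# Hypothetical toy sequence containing two consensus matches.
toy = "TTTACCGAAAACGGTGGGACCGTTTTCGGTAA"

for start, site in find_e2_sites(toy):
    print(start, site, cpg_positions(site))
```

Each match contains at least the two CpGs fixed by the consensus (offsets 2 and 8); the four variable positions can contribute further CpGs in a real sequence, so the count per site is a minimum rather than a constant.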

This straightforward mechanism in vitro is much more complicated in vivo, on the one hand due to the expression of different E2 proteins through differential splicing, some being transcriptional activators and some lacking the transcription activation domain, and on the other hand due to multiple and opposing functions of E2 binding sites depending on the genomic context. E2 proteins can be (1) activators of transcription when their binding site is remote from a promoter, the binding sites functioning as E2 protein dependent enhancers. Alternatively, they can (2) repress transcription when they bind target sites at the HPV E6 promoter, in part due to competition between E2 and the promoter factors SP1 and TFIID, whose binding sites overlap with E2 binding sites (Tan et al. 1994), and in part due to E2 complexes with histone modifying proteins (Smith et al. 2014). Lastly, (3) E2 also forms a complex with the replication factor E1 and increases its specificity and affinity to replication initiation sites, and (4) is involved in partitioning of papillomavirus genomes during mitosis (McBride 2013).

HPV methylation studies normally address only the second of these four functions. The reasoning goes as follows: For E2 protein to be expressed, the HPV genome must be continuous from the E6 promoter through the whole E2 gene, as E2 is translated from a polycistronic mRNA containing the E6, E7, E1 and E2 genes. This is the case when HPV genomes are episomal or exist as tandem repeats recombined with chromosomal DNA. In these two cases, the E2 protein serves a repressing feedback loop, binds to the E6 promoter, and decreases its activity. In this scenario, HPVs and their infected cells would have a growth advantage if the E2 binding sites overlapping with the E6 promoter were methylated, as E2 protein could not bind and could not lead to repression, increasing the amount of E6 and E7 oncoprotein production. No such advantage of host cells with HPV genomes with methylated E6 promoter sequences exists if no complete E2 transcript (and protein) can be delivered, which is the case when chromosomal recombination has interrupted the E2 gene, a frequent scenario in cancer (see below).

There is agreement that these scenarios are regularly encountered, but different extents of this mechanism were reported in different studies (Schwarz et al. 1985; Kalantari et al. 2001; Peitsaro et al. 2002; Arias-Pulido et al. 2006; Bhattacharjee and Sengupta 2006; Brandsma et al. 2009; Snellenberg et al. 2012; Chaiwongkot et al. 2013; Mirabello et al. 2013; Bryant et al. 2014). Reasons for disagreement are technical limits to differentiating between integrated and episomal viral DNA, as, for example, integrated DNA often exists as large concatemers. A role for the E2 protein can be deduced from observations that, typically, the rate of CpG methylation through most of the LCR of HPV16 is lower by a factor of 2–3 than methylation of the four CpGs within the promoter-proximal E2 binding sites, suggesting that clones were selected that have eliminated the negative regulation of the E6 promoter by E2, as this repressor can no longer bind to its targets.

1.5 Effects of Papillomaviruses on the Cellular Epigenome

It is well established that extensive epigenomic changes are an intrinsic part of carcinogenesis of all tissues irrespective of their association with papillomaviruses (Sharma et al. 2010), and epigenetic changes contribute to carcinogenesis with a weight similar to that of mutations and aneuploidies. The same applies to cancer of the cervix (Wentzensen et al. 2009; Louvanto et al. 2015; Siegel et al. 2015), and to those neoplasias that have etiologies with and without HPVs, such as anal and oral cancer (Hernandez et al. 2012; Jitesh et al. 2013). Although the cellular methylome of the same group of tumors may differ in the presence and the absence of HPVs (Sartor et al. 2011), there is no a priori need to assume that methylation may be affected by the functions of HPV gene products. Nevertheless, this may yet be the case, as the HPV-16 E7 oncoprotein was reported to associate in vitro and in vivo with the DNA methyltransferase DNMT1 and to stimulate its activity (Burgers et al. 2007). This observation opens up the possibility that this epigenetic effect directly influences cellular proliferation pathways. Subsequent studies proposed, as a consequence of this mechanism, suppression of E-cadherin expression and reduced adhesion between squamous epithelial cells (Laurson et al. 2010; D’Costa et al. 2012) and extended the effect to interactions of both E6 and E7 protein with components of the histone modification machinery (Bodily et al. 2011; Hsu et al. 2012).

1.6 Differential Methylation of HPV Genomes in Malignant Lesions

It has been known since the early days of HPV research in the 1980s that HPV genomes in cancer frequently exist in a form recombined with cellular DNA (Schwarz et al. 1985). It is now generally accepted that the transition from high-grade precursors (CIN III) to invasive carcinomas is accompanied by and possibly caused by this recombination (Mine et al. 2013), although it is still disputed whether all or only a subset of cancerous lesions contain HPV genomes in chromosomally recombined form (Kalantari et al. 2001; Peitsaro et al. 2002; Arias-Pulido et al. 2006; Bhattacharjee and Sengupta 2006; Brandsma et al. 2009; Snellenberg et al. 2012; Chaiwongkot et al. 2013; Mirabello et al. 2013; Bryant et al. 2014). Recombination can result in interruption of the early polycistronic E6-E7-E1-E2 transcription unit. Failure to express E2 stimulates oncoprotein expression due to a lack of negative feedback repression of E2 on the E6 promoter. Beyond this, mechanisms for eliminating remaining episomal HPV genomes have recently been proposed as essential for cervical carcinogenesis (Mine et al. 2013).

In malignant and high-grade premalignant lesions, likely due to recombination with the cellular chromosomes, HPV genomes clearly undergo substantial methylation beyond the levels observed for episomal genomes (exceeding for some CpG residues 50%) as confirmed for HPV16 (Kalantari et al. 2004, 2014; Bhattacharjee and Sengupta 2006; Brandsma et al. 2009; Sun et al. 2011; Vinokurova and Knebel Doeberitz 2011; Xi et al. 2011; Clarke et al. 2012; Patel et al. 2012; Mirabello et al. 2013; Park et al. 2011; Verhoef et al. 2014; Frimer et al. 2015), HPV18 (Badal et al. 2004; Turan et al. 2006; Wentzensen et al. 2012; Kalantari et al. 2014; Vasiljevic et al. 2014), HPV31 (Wentzensen et al. 2012; Kalantari et al. 2014; Vasiljevic et al. 2014), HPV33 (Vasiljevic et al. 2014), HPV45 (Wentzensen et al. 2012; Kalantari et al. 2014), HPV52, and HPV58 (Murakami et al. 2013). For HPV16, this was reported not only for cervical but also vulval (Bryant et al. 2014), penile (Kalantari et al. 2008b), oral (Balderas-Loaeza et al. 2007), and anal cancer (Wiley et al. 2005; Hernandez et al. 2012). Methylation is relatively low in the LCR (which, together with the use of methylation sensitive restriction enzymes as opposed to bisulfite sequencing, led to an original misinterpretation of this mechanism, Badal et al. 2003), and is highest at certain CpGs in the late genes L2 and L1 (Brandsma et al. 2014; Mirabello et al. 2015).

Findings of increased methylation of HPV genomes correlating with the increasing severity of the lesion (from CIN I through CIN III to invasive cancer) were surprising and counterintuitive, as DNA methylation is normally seen as a transcription repression mechanism. The resolution of this contradiction came from two sources. Van Tine et al. (2004) reported in situ studies showing that cervical tumors typically contain numerous (i.e., up to a few hundred) HPV genome copies. All of these viral genomes are transcriptionally inactive, except one, which is the only source of E6 and E7 oncogene transcripts. In other words, some selective methylation mechanism targets these recombinant HPV genomes. Should all of them become methylated, HPV transcription would end, and such a clone would never grow into a detectable tumor. Only cells with one or few transcriptionally active HPV genomes grow into a detectable lesion.

This mechanism was further confirmed with the study of two cervical cancer cell lines, SiHa and CaSki. SiHa cells contain a single chromosomally recombined HPV16 genome, whose LCR is unmethylated and therefore transcriptionally active. CaSki cells contain about 500 HPV16 genomes, but generate a similar level of transcripts as SiHa cells. Not surprisingly, all HPV16 genomes in CaSki cells except one are methylated and transcriptionally inactive, oncogene transcripts being generated from the only unmethylated viral genome (Kalantari et al. 2004). However, it is not a necessary condition that most HPV genomes become methylated. The well-known cell line HeLa was derived from a cervical adenocarcinoma and was shown to contain about 50 chromosomally recombined copies of HPV18 DNA. The analysis of its HPV18 genomes showed that the LCR and the E6 gene are generally not methylated and remain transcriptionally active (Johannsen and Lambert 2013), while parts of the genome that are upstream of the LCR, such as the L1 gene, are heavily methylated (Turan et al. 2007).

It is unknown why chromosomally recombined HPV genomes become preferentially methylated. HPV DNA may be targeted by a methylation mechanism affecting all foreign DNA in mammalian cells (Dörfler et al. 2001). More recently, a view has emerged that the methylated state may quite generally be the default state of the host's chromosomal DNA, locking genes in an off position (Edwards et al. 2010; Schuebeler 2015).

1.7 HPV Epigenomes and Cellular Epigenomic Properties of HPV Infected Cells as Cancer Biomarkers

Cancer of the cervix affects about 500,000 women every year, and about half of these die of this disease. It is the most prevalent cancer in women in many developing nations, but its incidence has been reduced in developed nations, to a large part through early diagnosis of precancerous lesions and surgical intervention. From the 1950s to the 1990s, diagnosis was mostly based on the Papanicolaou test (Pap test), which can be complemented with colposcopic observation of lesions. The Pap test is a staining test of a cervical smear obtained during a gynecological examination, which was developed without knowledge of the viral etiology of cervical cancer. The Pap test is a tremendous public health success, but it is less than satisfactory because it misses many lesions and thus has a high rate of false negative diagnoses. Since HPV infections are the sole underlying cause of precancerous cervical neoplasia, HPV DNA detection has become a valuable tool to complement or replace the Pap test. However, many women are carriers of HPV infections that never progress toward malignancies. At this time, the best practice is to administer both a Pap test and an HPV DNA test to a patient and to interpret the outcome in the context of the age and the previous diagnostic history of the patient (Saslow et al. 2012).

From these considerations, it is obvious that the triage of women with a positive Pap test or positive for HPV infection would benefit from the development of tests based on novel biomarkers. DNA methylation has the potential to be such a biomarker, whose detection can be technically standardized and adapted to high-throughput processing. This chapter has discussed that HPV DNA is either unmethylated or weakly methylated in asymptomatic infections and precancerous CIN I lesions, while it is heavily methylated in cancer, an increase that begins in high-grade precursor lesions (CIN III). Methylation is particularly high at certain CpG dinucleotides in the late genes L1 and L2, identifying the best targets for HPV methylation analysis (Brandsma et al. 2009). A highly sensitive detection of these methylation changes may help to separate patients with malignantly progressing cervical lesions from those not undergoing such changes, as evaluated recently (Brandsma et al. 2014). In order to eliminate the time-consuming DNA sequencing, HPV18 DNA methylation could be efficiently detected by PCR with methylation-specific primers or with real-time PCR (Turan et al. 2007). As an alternative improvement toward clinical application, it has been shown that next-generation sequencing allows the establishment of the whole HPV16 methylome and eliminates laborious purification of PCR amplicons. Alternatively, pyrosequencing can target segments of HPV genomes of specific relevance for diagnosis. The same publication confirmed that high-grade precursors had a higher methylation than low-grade precursors, the decisive criterion for the usefulness to detect lesions likely to progress toward cancer (Mirabello et al. 2015). Beyond the analysis of the HPV genome, specific cellular genes such as DAPK and RARB are frequently methylated in cervical cancer (Wentzensen et al. 2009), and it may strengthen epigenomic testing to combine the measurement of HPV DNA methylation with that of the methylation status of such cellular genes (Sartor et al. 2011; Johannsen and Lambert 2013; Kalantari et al. 2014; Louvanto et al. 2015; Siegel et al. 2015). At this point, the utility of HPV methylation deserves to be further studied as a strategy to identify women at high risk for cervical cancer.