Introduction

Head and neck squamous cell carcinoma (HNSCC) has traditionally been associated with alcohol and tobacco use. Recently, evidence has emerged for a distinct etiology of HNSCC involving human papillomavirus (HPV) infection. Whereas HPV-negative HNSCC has been on the decline (down 50 % between 1984 and 2004), the incidence of HPV-positive HNSCC has increased by 225 % over the same period [1]. Tumors result from infection by a small group of high-risk HPV subtypes, including HPV16, which accounts for 90 % of HPV-positive HNSCC [2]. HPV-positive tumors are primarily located in the oropharynx, are often non-keratinizing and poorly differentiated, and are more likely to present with regional lymph node metastasis. In spite of these unfavorable characteristics, HPV-positive tumors demonstrate increased local control as well as better disease-free and overall survival compared to HPV-negative HNSCC [3].

The success of current therapy against HPV-positive HNSCC has introduced the idea of therapy de-escalation. Concomitant chemoradiation therapy continues to be the mainstay of HNSCC treatment; however, the treatment is associated with significant morbidity [4] despite development of intensity-modulated radiation therapy (IMRT) and other radiation field and dose modifications, as well as protocols addressing alternative chemotherapeutic agents with fewer side effects. Current standard of care is not dependent upon HPV status. If these HPV-positive patients truly are more responsive, it may be possible to reduce the amount of chemotherapy or use irradiation exclusively in order to minimize the associated side effects.

Despite the fact that HPV-positive HNSCC patients show an improved response to therapy, there is a small minority of patients who do not respond as well. In order to optimize treatment and avoid unnecessary morbidity, it would be highly beneficial to develop biomarkers that could individualize medicine for this sub-population. In addition, identification of patients who respond positively would enable personalized therapy with the possibility of treatment de-escalation. In this way, treatment tailored to each individual patient could target patients most likely to benefit from the suggested regimen. However, none of the published studies attempt to differentiate HPV-positive patients based upon their response to therapy. This pilot study aims to identify biomarkers that could differentiate between HPV-positive HNSCC patients who respond and do not respond to current therapy standards.

Materials and Methods

Patients and Specimens

Samples were obtained from pre-treatment biopsies and salvage surgery specimens from patients managed through the Multidisciplinary Head and Neck Clinic at William Beaumont Hospital, Royal Oak, Michigan. Patients were consented by Beaumont BioBank clinical staff using an IRB approved protocol (HIC 2008-180), and samples were processed and stored at −80 °C using standard operating procedures. The majority of samples were processed and stored within 30 min.

RNA Isolation

RNA was isolated from fresh frozen or OCT-preserved tissue using RNeasy mini or micro kits (Qiagen, Valencia, CA), which included DNase treatment. Tissue was minced and homogenized into lysis buffer using a Polytron tissue homogenizer (Thermo Fisher Scientific Inc., Waltham, MA) with subsequent passage through a QIAshredder (Qiagen) column. RNA was quantitated using a NanoDrop spectrophotometer (Thermo Fisher Scientific Inc.), and quality was determined on an Agilent Bioanalyzer (Agilent Technologies Inc., Santa Clara, CA).

HPV16 Screening

Given that testing for HPV is not a routine clinical procedure, HPV status was determined using a PCR-based (Eppendorf Realplex Mastercycler; Hauppauge, NY) screening method. To qualify as HPV-positive, the HPV16-specific E7 gene had to be detected in both DNA and RNA. Due to limited quantities, RNA underwent a preliminary amplification step. Total RNA was reverse transcribed using a Superscript III cDNA synthesis kit (Invitrogen; Grand Island, NY) following manufacturer’s protocol. cDNA was then pre-amplified using TaqMan PreAmp Master Mix Kit (Applied Biosystems; Grand Island, NY) with a pool of GAPDH and HPV16-E7 primers. The conditions for the pre-amplification were: 95 °C for 10 min then 14 cycles of 95 °C for 15 s/60 °C for 4 min. Samples were then diluted, and the PCR reaction mixture was prepared containing substrate (gDNA or cDNA), TaqMan Gene Expression Master Mix (Applied Biosystems), and either HPV16 E7 or GAPDH primers. The following thermocycling conditions were used: 50 °C for 2 min, 95 °C for 2 min, then 40 cycles of 95 °C for 15 s/60 °C for 1 min. The following primers and probes were used for HPV16 E7 [5] (Integrated DNA Technologies Inc., San Diego, CA): 5′-CGAATGTCTACGTGTGTGCTTTG-3′, 5′-CCGGACAGAGCCCATTACAA-3′, and probe 5′-CGCACAACCGAAGCGTAGAGTCACACT-3′. Primers and probes for GAPDH were ordered as predesigned assays (Assay Hs99999905_m1; Applied Biosystems).

Microarray Preparation and Analysis

Total RNA from each sample was labeled using the Ambion WT Expression Kit (Ambion Inc., Austin, TX), which uses a priming method that selectively primes non-ribosomal RNA, providing complete and unbiased coverage of the transcriptome while significantly reducing the priming of rRNA. End labeling was performed according to the Affymetrix GeneChip® WT terminal labeling and hybridization protocol. Human Exon 1.0 ST arrays (Affymetrix, Santa Clara, CA) were hybridized overnight, washed and stained on a GeneChip® Fluidics Station 450 using standard protocols, then scanned using an Affymetrix GeneChip® Scanner 3000 7G.

The CEL files containing the raw intensity data from the Affymetrix GeneChip arrays were imported into Partek Genomics Suite (version 6.6 beta, build 6.12.0207; Partek Inc., St. Louis, MO) and normalized using the robust multichip average with a guanine-cytosine content background correction, quantile normalization, log2-transformation, and median polish probe set summarization. Exon-level data were summarized to genes using the average of the probe sets. Differentially expressed genes were identified using ANOVA with two factors: tumor grade and scan date (random effect). Hierarchical clustering was completed with Partek® software using Euclidean distance as the similarity measure and average linkage as the agglomerative method. Categorization and pathway analysis were performed using Pathway Studio (Ariadne Genomics, Rockville, MD). The data discussed in this publication are accessible through NCBI’s Gene Expression Omnibus [6] (GEO series accession number GSE40020).

Publicly available datasets from NCBI’s GEO (GSE55544 and GSE55546) were analyzed for differentially expressed genes using the GEO2R analysis tool built into the GEO website (http://www.ncbi.nlm.nih.gov/geo/geo2r). Genes were identified as differentially expressed using a false discovery rate of 5 %.
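As an illustration of how a 5 % false discovery rate threshold can be applied to a list of per-gene p values, the following is a minimal sketch of the Benjamini-Hochberg adjustment. It assumes that the Benjamini-Hochberg procedure is the adjustment in use (the text states only the 5 % threshold), and the function names and input values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant_at_fdr(p_values, fdr=0.05):
    """Flag each gene whose adjusted p value is at or below the FDR threshold."""
    return [q <= fdr for q in benjamini_hochberg(p_values)]


# Hypothetical p values for four genes (not taken from the study).
flags = significant_at_fdr([0.01, 0.04, 0.03, 0.5])
```

In this toy input only the first gene remains significant after adjustment, which shows how a gene with a nominal p value below 0.05 can still fail a 5 % false discovery rate criterion.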

Quantitative Real-Time PCR

Gene expression levels were validated using a Realplex Mastercycler (Eppendorf, Hauppauge, NY). The following pre-designed TaqMan gene specific primers (Applied Biosystems, Grand Island, NY) were used: LCE3D (Assay ID Hs00754375_s1), KRTDAP (Assay ID Hs00415563_m1), HMOX1 (Assay ID Hs01110250_m1), KRT19 (Assay ID Hs00761767_s1), MDK (Assay ID Hs00171064_m1), TSPAN1 (Assay ID Hs00371661_m1), and GAPDH (Assay ID Hs99999905_m1). Assay was performed on cDNA generated by the WT amplification of RNA described above. Quantitative real-time PCR reaction mixture was prepared containing cDNA, TaqMan Gene Expression Master Mix, and Gene Expression Assay (Applied Biosystems). The following thermocycling conditions were used: 50 °C for 2 min, 95 °C for 10 min, and 40 cycles of 95 °C for 15 s/60 °C for 1 min. The delta delta CT method was used for analysis. For fold-change calculations, CT values were limited to a maximum of 40.
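The fold-change calculation described above can be sketched as follows. This is a minimal illustration of the standard 2^(-ΔΔCT) calculation with CT values capped at 40; it assumes 100 % amplification efficiency, per-group averaging of ΔCT values, and that reactions with no detectable amplification are assigned the cap. None of these details is specified in the text, and all names and values are hypothetical.

```python
from statistics import mean

CT_MAX = 40.0  # CT values are limited to this maximum


def capped(ct, ct_max=CT_MAX):
    """Cap a CT value; None (no amplification) is treated as ct_max (assumption)."""
    return ct_max if ct is None else min(ct, ct_max)


def mean_delta_ct(target_cts, reference_cts):
    """Average of (target CT - reference CT) across the samples in one group."""
    return mean(capped(t) - capped(r) for t, r in zip(target_cts, reference_cts))


def fold_change(test_target, test_ref, ctrl_target, ctrl_ref):
    """Fold change of the test group relative to the control group (2^-ddCT)."""
    ddct = mean_delta_ct(test_target, test_ref) - mean_delta_ct(ctrl_target, ctrl_ref)
    return 2.0 ** (-ddct)


# Hypothetical CT values: one test sample shows no amplification of the target.
fc = fold_change(
    test_target=[30.0, None], test_ref=[20.0, 20.0],
    ctrl_target=[25.0, 25.0], ctrl_ref=[20.0, 20.0],
)
```

Because an undetected target contributes the maximum ΔCT allowed by the cap, a group containing several undetected samples can yield a very large apparent fold change.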

Results

HPV Testing and Patient Characteristics

DNA and RNA isolated from each sample were analyzed with RT-PCR using primers specific to the E7 gene of HPV16. Nineteen samples were positive for the E7 gene at both the DNA and RNA level. All 19 patient samples were included in subsequent analysis. Supplemental Figure 1 illustrates these results. In addition to the 19 HPV-positive samples subsequently used in this study, other samples were analyzed that were negative for HPV RNA or both gDNA and RNA.

Patient demographics and lesion characteristics are included in Table 1. All patients received conventional intensity-modulated radiation therapy with or without additional treatment with Erbitux, Cisplatin, or other chemotherapeutic agents. In accordance with NCCN guidelines, chemoradiation is given to stages III and IV cancers of the head and neck, when offered as primary treatment. Radiation alone is reserved for stages I and II cancer. Patients were grouped into Complete Responders (CRs) and Post-treatment Failures (Post-Tx Fails); consistent with NCCN guidelines, the degree of response was determined 3 months post-treatment based upon CT, PET, and clinical evaluation. The average ages of the CRs (64.3) and the Post-Tx Fails (62.0) were not significantly different. The primary site of all CRs was the oropharynx. The Post-Tx Fails were predominantly situated in the oropharynx, but 2 of the 7 (29 %) were in the oral cavity. Both the CRs and Post-Tx Fails ranged from Stages II–IV. Seven of 12 (58 %) CRs demonstrated positive lymph nodes at time of diagnosis, while 3 of the 7 (43 %) Post-Tx Fails had positive nodes at time of recurrence.

Table 1 Clinical data

Gene Expression Changes

The 19 samples of HPV-positive head and neck cancer were analyzed using the Affymetrix Exon arrays. There were 262 genes identified as showing significant changes in expression (p ≤ 0.05 and a 1.5-fold cutoff) between Post-Tx Fails and CRs (Supplemental Table 1), including upregulation of 114 genes in the Post-Tx Fails and downregulation of 148. The most highly upregulated genes in Post-Tx Fails included late cornified envelope 3D (LCE3D), keratinocyte differentiation-associated protein (KRTDAP), and heme oxygenase 1 (HMOX1). Among the most downregulated were tetraspanin 1 (TSPAN1), midkine (MDK), and keratin 19 (KRT19).
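The combined significance and fold-change criteria can be sketched as a simple filter. The example assumes that group means are on the log2 scale (as produced by the normalization described in the Methods) and that the 1.5-fold cutoff applies in both directions; the function name and the input values are hypothetical.

```python
import math


def is_differentially_expressed(log2_mean_fail, log2_mean_cr, p_value,
                                p_cutoff=0.05, fold_cutoff=1.5):
    """Apply p <= p_cutoff and an absolute fold change >= fold_cutoff."""
    # A difference of log2 means equals the log2 of the fold change.
    log2_fc = log2_mean_fail - log2_mean_cr
    return p_value <= p_cutoff and abs(log2_fc) >= math.log2(fold_cutoff)


# Hypothetical genes: (log2 mean in Post-Tx Fails, log2 mean in CRs, ANOVA p value).
genes = {
    "geneA": (8.0, 7.0, 0.01),   # 2-fold up, significant
    "geneB": (7.5, 7.0, 0.01),   # about 1.41-fold up, below the fold cutoff
    "geneC": (7.0, 8.0, 0.05),   # 2-fold down, at the p cutoff
    "geneD": (8.0, 7.0, 0.06),   # 2-fold up, not significant
}
selected = [name for name, args in genes.items() if is_differentially_expressed(*args)]
```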

Samples from fourteen patients (7 CRs and 7 Post-Tx Fails) were used to validate the microarray results. RT-PCR confirmed the microarray results of all 6 genes (Fig. 1). The 5 CRs that were not included had been exhausted in the previous analyses. As seen previously [7], results from microarray analysis tend to underestimate the degree of change found by RT-PCR. MDK showed the greatest disparity between the microarray and RT-PCR results. Given that the fold-change is the ratio of the CRs and Post-Tx Fails, the absence of MDK detection by RT-PCR in 3 of the 7 Post-Tx Fails resulted in an inflated representation of fold-change.

Fig. 1 Validation of microarray results utilizing RT-PCR. Values indicate fold-difference of Post-Tx Fails compared to CRs. Light gray, microarray result; dark gray, RT-PCR result

Ariadne Pathway Studio categorized the differentially expressed genes into sub-networks [8] that are generated based upon the results of MedScan, a literature mining program that searches publicly available literature such as PubMed for relationships between entities [9]. One sub-network categorizes genes based upon protein involvement in regulating specific cellular processes. Table 2 lists the top cellular processes identified by a Fisher’s exact test as overrepresented by the 262 differentially expressed genes. These highly represented categories consist of two main groups related to DNA damage and cell cycle.
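To illustrate the kind of overrepresentation test referred to above, the following is a minimal sketch of a one-sided Fisher's exact test, computed as the upper tail of the hypergeometric distribution. It is not the Pathway Studio implementation; the one-sided form, the function name, and the counts in the example are assumptions made for illustration only.

```python
from math import comb


def enrichment_p_value(hits_in_list, list_size, hits_in_background, background_size):
    """One-sided Fisher's exact test for overrepresentation.

    Returns P(X >= hits_in_list), where X is the number of category members
    drawn when list_size genes are sampled without replacement from a
    background of background_size genes containing hits_in_background members.
    """
    max_hits = min(list_size, hits_in_background)
    non_members = background_size - hits_in_background
    total = comb(background_size, list_size)
    tail = sum(
        comb(hits_in_background, k) * comb(non_members, list_size - k)
        for k in range(hits_in_list, max_hits + 1)
    )
    return tail / total


# Hypothetical counts: 2 of 3 listed genes fall in a category that contains
# 4 of the 10 genes in the background.
p = enrichment_p_value(2, 3, 4, 10)
```

A small p value indicates that the category contains more of the differentially expressed genes than would be expected from its share of the background.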

Table 2 Sub-networks of genes/proteins regulating cell processes that are highly represented by genes that are differentially expressed between the Post-Tx Fails and CRs (p ≤ 0.05 and 1.5-fold cutoff)

Hierarchical Clustering

Hierarchical clustering was performed based upon the 49 genes differentially expressed at p ≤ 0.01 and a 1.5-fold cutoff. These 49 differentially expressed genes adequately distinguished between the CRs and the Post-Tx Fails (Fig. 2). Only a single Post-Tx Fail patient (BN307) clustered with the CRs. The clustering of this patient is difficult to explain given the patient’s otherwise unfavorable characteristics: high tumor stage (stage IVA), involvement of contralateral lymph nodes, evidence of alcohol and extensive (100+ pack years) tobacco use, and synchronous lung cancer.
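The clustering approach named in the Methods (Euclidean distance with average linkage) can be sketched in a few lines. This is a simplified illustration rather than the Partek implementation: it merges clusters until a requested number remains instead of building a full dendrogram, and the sample names and expression values are hypothetical.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(cluster_a, cluster_b, profiles):
    """Mean pairwise distance between the members of two clusters."""
    dists = [euclidean(profiles[i], profiles[j]) for i in cluster_a for j in cluster_b]
    return sum(dists) / len(dists)


def agglomerate(profiles, n_clusters=2):
    """Merge the closest pair of clusters until n_clusters remain."""
    clusters = [[name] for name in profiles]
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_linkage(clusters[pair[0]], clusters[pair[1]], profiles),
        )
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return clusters


# Hypothetical log2 expression profiles (two genes per sample).
profiles = {
    "sample1": [0.0, 0.0],
    "sample2": [0.0, 1.0],
    "sample3": [10.0, 10.0],
    "sample4": [10.0, 11.0],
}
groups = agglomerate(profiles, n_clusters=2)
```

With these values the two low-expression samples and the two high-expression samples form separate clusters, analogous to the separation of CRs and Post-Tx Fails on the 49-gene set.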

Fig. 2 Hierarchical clustering based upon 49 genes that are differentially expressed between the Post-Tx Fails and CRs (p ≤ 0.01 and 1.5-fold cutoff). Patients are clustered on the horizontal axis (samples in orange correspond to the Post-Tx Fails; samples in purple correspond to the CRs)

Sub-Network and Pathway Analysis

In order to better elucidate differences in HPV-positive HNSCC based upon the response to chemoradiation, the microarray data were investigated to identify highly regulated expression pathway sub-networks [8, 10] using Ariadne Sub-Network Enrichment Analysis (SNEA). An expression pathway sub-network consists of a single seed and the proteins associated with this seed, either through regulation of expression (of or by the seed) or through promoter binding (of or by the seed) [11]. Three of the most highly regulated sub-networks between CRs and Post-Tx Fails were built around the interrelated seeds of E2F3, E2F4, and the general E2F family. The three E2F-associated sub-networks were combined, and the visualized sub-network was limited to include only those genes that were differentially expressed at ANOVA p ≤ 0.10 and a 1.2-fold cutoff (Fig. 3). Less stringent conditions have previously been shown to be more appropriate for pathway analysis [12, 13]. The combination of these sub-networks illustrates that genes associated with E2F are generally downregulated in patients who fail to respond to chemoradiation therapy.

Fig. 3 Sub-network of genes regulated by E2F3, E2F4, and the E2F functional group. Pathway includes genes differentially expressed between Post-Tx Fails and CRs (ANOVA, p ≤ 0.10 and fold-change ≥1.2). Red, upregulated in Post-Tx Fails compared to CRs; blue, downregulated. p value and fold-change requirement not met for E2F3 and E2F4

Comparison to Publicly Available Data

Two microarray series from NCBI’s GEO were examined for comparison with the current results. Series GSE55544 includes data from HPV-positive and HPV-negative oral and oropharyngeal cancers. Series GSE55546 investigates HPV-positive oropharyngeal cancer as compared to benign uvula and tonsil. Utilizing the GEO2R tool to determine differentially expressed genes, there were six genes that were differentially expressed in both publicly available datasets and in the current results—mini-chromosome maintenance complex component 2 (MCM2), MCM3, MCM4, pituitary tumor-transforming 1 (PTTG1), RAD54 homolog B (RAD54B), and synovial sarcoma, X breakpoint 2 interacting protein (SSX2IP). All six genes are associated with genome stability, and four of these genes are found to be regulated by E2F or E2F4 (MCM2, MCM3, MCM4, and PTTG1—Fig. 3).

Discussion

Personalized medicine is defined by the National Institutes of Health as the “emerging practice of medicine that uses an individual’s genetic profile to guide decisions made in regard to the prevention, diagnosis, and treatment of disease.” Ideally biomarkers help tailor treatment to fit an individual based upon their ability to predict a patient’s prognosis or response to certain therapies. The goal of this pilot study was to identify biomarkers to differentiate HPV-positive HNSCC patients based upon their response to treatment, in order to identify an assay that could assist clinicians in choosing alternative treatment regimens if current standard of care would prove ineffective. Given the increasing incidence and improved response of HPV-positive head and neck patients, it is important to discern patients based upon response to current treatment modalities. While we have previously reported a set of biomarkers as defined by cDNA microarray analysis that can predict outcome of induction chemotherapy [14] as well as radiation based treatment in pre-treatment specimens alone, this earlier study did not discriminate between HPV-positive and HPV-negative HNSCC.

The present study identifies several potential biomarkers, including upregulation of LCE3D and KRTDAP in the Post-Tx Fails and downregulation of KRT19. LCE3D and KRTDAP are associated with differentiation and keratinization [15, 16]. In a study comparing HPV-positive and HPV-negative oropharyngeal carcinoma [17], Slebos et al. found that keratinization-associated proteins were enriched in HPV-negative carcinoma, which is consistent with the current results, in which the LCE3D and KRTDAP genes are upregulated in Post-Tx Fails. Among the most downregulated genes in Post-Tx Fails was keratin 19 (KRT19). KRT19 was identified as a variably expressed marker in oral SCC that was underexpressed in well-differentiated compared to poorly differentiated oral SCC [18].

In addition to specific gene expression changes that can be utilized as biomarkers, microarray-based analyses also reveal broad-spectrum differences between Post-Tx Fails and CRs. Analysis limited to the differentially expressed genes found that differences between the patient populations were concentrated in cell processes of genome stability, cell cycle, and DNA repair (Table 2). Closer investigation of these categories illustrated that the vast majority of associated differentially expressed genes are underexpressed in the Post-Tx Fails (Supplemental Table 2). SNEA, an analysis method that exploits all data on the array without predefined p value or fold-change limits, identified the sub-network of genes associated with cell cycle as demonstrating differences between the Post-Tx Fails and CRs. This suggests that differences observed in response to therapy may be due to alterations in cell cycle related gene expression. Additionally, SNEA found sub-networks centered on genes associated with E2F, in particular E2F3 and E2F4, as highly regulated (Fig. 3). E2F3 protein binds specifically to retinoblastoma protein pRB in a cell cycle-dependent manner, while E2F4 protein binds to the three tumor suppressor proteins pRB, p107 and p130. The association with pRB is thought-provoking given the overlap with HPV E7 signaling. The expectation is that the sub-network of genes associated with E2F would be upregulated because E7 inhibits pRB function, leaving E2F unsuppressed. That expectation should hold true for both CRs and Post-Tx Fails because all samples utilized in the current study express the viral E7 oncogene. The notable aspect of the findings illustrated in Fig. 3 is that patients who do not respond to treatment do not show this expected activation.

Closer examination of the sub-network reveals several groups of cell cycle associated genes: cyclins (CCNA2, CCNB1, CCND2, CCNE2), cyclin-dependent kinase inhibitors (CDKN2A, CDKN2C), and mini-chromosome maintenance proteins that are involved in the initiation of eukaryotic genome replication (MCM2, MCM3, MCM4, MCM5, MCM6, MCM7, MCM8, MCM10). The vast majority of these cell cycle associated genes are downregulated in the Post-Tx Fail patients.

Interestingly these broad spectrum variations between responsive and non-responsive HPV+ HNSCC patients are comparable to those seen in a study of HNSCC recurrence [19]. Giri et al. examined a cohort of HNSCC patients that had undergone surgery with subsequent radiation treatment. In a comparison of patients that developed distant metastasis to those with no recurrence, differentially expressed genes were found to be associated with cell growth and proliferation, cell morphology and organization, cell cycle, and DNA repair. This corresponds well with the present results in HPV+ patients who demonstrated differences in cell cycle (including mitosis, cell cycle checkpoint, S phase), cell proliferation, and DNA repair. In particular, Giri et al. found that E2F4 was downregulated in HNSCC patients that later developed distant metastasis. While E2F4 was not recognized as differentially expressed in the HPV+ HNSCC patients, the signaling and expression targets of E2F4 were found to be downregulated in HPV+ HNSCC patients that have failed treatment (Fig. 3).

These results were further corroborated by public datasets available through NCBI’s Gene Expression Omnibus. While differences in microarray platform often give rise to minimal overlap between studies (the current study utilized Affymetrix GeneChips while the public data were generated using Agilent technology), 6 genes were differentially expressed in the 2 public datasets as well as in our current study. The majority were involved in the E2F sub-networks, and all were related to genome stability. While neither public dataset focused on outcomes, one compared HPV-negative cancers, which are known to have worse outcomes, to HPV-positive cancers, which generally have better outcomes. The other GEO dataset compared oropharyngeal tumors to normal tissue. While not a direct extension of the current study, including comparisons to normal tissue as well as HPV-negative HNSCC provides further insight into the fundamental biology of HNSCC.

The primary deficiency of the study is the comparison of post-treatment samples in patients who failed treatment with pre-treatment samples of the complete responders. The rationale for this study design was that no treatment failures were available in the CR group, and the patients who had recurrences were referred to us from outside institutions where no pre-treatment samples to assay for biomarkers were available. While future studies will work to confirm the current results with pre-treatment samples from patients who fail to respond to treatment, there is validity to the current comparison. Ding et al. [20] proposed that recurrence following chemotherapy will have acquired mutations not found in the primary tumor that are important for relapse growth and resistance to chemotherapy. That is, treatment eliminates the bulk of the original heterogeneous tumor leaving the recurrence to propagate from the remaining clones that contain the important, relevant genetic changes. In this way, the recurrence following treatment may be enriched with the genetic changes that are truly important in the treatment failure. While the Ding study focused on gene mutations, the accrued mutations would theoretically produce downstream differences in gene and protein expression. It is these variances in gene expression that this study endeavored to demonstrate.

Additionally, it is worth noting that several of the Post-Tx Fail cases were from sites outside of the oropharynx, the traditional site for HPV+ squamous cell carcinoma. HPV-positive tumors tend to be non-keratinizing and poorly differentiated, while tumors at sites outside of the oropharynx are often keratinizing squamous cell carcinoma. Of the four non-oropharyngeal Post-Tx Fail cases used in the current study, all were oral cancers. Two of these were keratinizing, moderately differentiated squamous cell carcinoma; the others were non-keratinizing, poorly differentiated squamous cell carcinoma. While the identification of keratinocyte-associated LCE3D and KRTDAP as markers of response should be validated in a larger cohort, we hypothesize that the non-responsive HPV+ tumors may in fact show molecular characteristics similar to less treatable HNSCC such as HPV-negative and non-oropharyngeal cancers.

This pilot study initiates the investigation of differences between patients with HPV-positive head and neck cancer based on their response to chemoradiation. While a large percentage of these patients respond well to current treatment modalities, there is still a population that does not respond. In addition to identifying those patients who will not respond to treatment and thus would be better served by alternate treatment modalities, it will be important to identify the population that is likely to respond, thus enabling a discussion of de-escalating therapy. The results of the current study identify specific genes, as well as the cell cycle process, as highly regulated between responders and non-responders; these may serve as future biomarkers in HPV-positive HNSCC patients. If verified, these findings may initiate development of an assay that could assist clinicians in treatment decisions.