Introduction

5-Methylcytosine, present at 70–80% of all CpG dinucleotides, is the major modified base found in mammalian DNA. It has been known for some time that the level of 5-methylcytosine is globally reduced in tumor tissues relative to corresponding normal tissues [14, 16, 36], mostly due to hypomethylation of different classes of repetitive DNA sequences [34]. It was also observed that gene-specific hypermethylation at CpG-rich sequences, so-called CpG islands, occurs specifically in cancer tissues [2]. CpG island hypermethylation is commonly observed during the development and progression of human tumors [21]. In the 1990s, researchers reported hypermethylation of CpG islands of several known tumor suppressor genes and of other genes involved in important growth control or genome defense pathways, for example DNA repair genes [8, 12, 17, 20, 21, 23, 24, 27]. Today, many reports document methylation of CpG islands associated with a large number of different genes in almost every type of human malignancy. In lung cancer, many specific CpG islands are methylated, for example those associated with the genes CDKN2A, RASSF1A, RARbeta, MGMT, GSTP1, CDH13, APC, DAPK, TIMP3, and a number of others [1, 9, 11, 42, 47, 49]. The methylation frequency of a particular CpG island (defined as the percentage of tumors analyzed in a study that carry methylated alleles) generally ranges from less than 10% to over 80% of the tumors, depending on the histological subtype of tumor, the study population, and/or the methodology used to assess and quantitate DNA methylation.
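
The definition of methylation frequency given above is simple arithmetic, and a minimal Python sketch may make it concrete. The function name and the example series are illustrative assumptions; the calls are hypothetical and are not data from any of the cited studies.

```python
def methylation_frequency(methylated_calls):
    """Return the percentage of tumors that carry methylated alleles.

    `methylated_calls` holds one boolean per tumor analyzed
    (True = methylated alleles detected at this CpG island).
    """
    calls = list(methylated_calls)
    if not calls:
        raise ValueError("at least one tumor is required")
    return 100.0 * sum(calls) / len(calls)


# Hypothetical series: 17 of 20 tumors methylated at one CpG island.
example_calls = [True] * 17 + [False] * 3
print(f"{methylation_frequency(example_calls):.0f}%")  # prints "85%"
```

Because the denominator is the number of tumors analyzed in a given study, frequencies reported for the same CpG island can differ between studies simply because the series differ in size, histology, or detection method.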

Detection of methylated CpG islands in accessible biological materials such as serum or sputum has recognized potential for the early detection and diagnosis of cancer, including lung cancer [3, 25, 43]. Accomplishing this goal requires highly sensitive techniques that can clearly detect methylated tumor-associated DNA fragments among a large excess of unmethylated molecules. The most useful DNA methylation markers would be genes that show close to background levels of methylation in normal human tissues and that are methylated in a large fraction of the tumors.

Current research approaches in cancer epigenetics are focused on the characterization of the complete set of DNA methylation changes in cancer. Several techniques have been introduced and have been reviewed comprehensively [13]. We recently developed a useful method, the methylated CpG island recovery assay (MIRA), which, unlike most other approaches, does not depend on the use of sodium bisulfite, restriction enzymes, or antibodies for identifying the methylated regions [31]. The MIRA method is based on the high affinity of the MBD2b/MBD3L1 protein complex for methylated CpG dinucleotides. For efficient pull-down of methylated DNA by this method, two or more methylated CpG sites in a DNA fragment of 50 base pairs or less are required [32]. MIRA is compatible with microarray analysis [32] or high-throughput DNA sequencing [35]. We have used the MIRA method in combination with high-resolution CpG island microarrays to characterize the full extent of DNA methylation changes that occur at CpG islands in non-small cell lung cancer tissues.

DNA methylation analysis of lung cancer

To analyze tumor-associated DNA methylation changes, we compared stage I lung squamous cell carcinomas (SCCs) or adenocarcinomas (ACs) to matched normal lung tissues [34]. We used the MIRA-assisted microarray method for DNA methylation analysis [32, 33]. MIRA-enriched methylated DNA fractions obtained from tumor tissue and from matching normal lung tissue removed at surgery were analyzed on Agilent CpG island arrays covering a total of 27,800 CpG islands.

A complete set of hypermethylated CpG islands and discovery of new DNA methylation biomarkers

Five stage I squamous cell carcinomas and eight stage I adenocarcinomas of the lung were analyzed on these arrays along with matched normal lung. Using the criteria and cutoffs defined previously [34], the number of methylated CpG islands ranged from 216 to 744 in the five individual squamous cell tumors (Table 1). For adenocarcinomas, between 219 and 908 CpG islands were methylated per tumor (Table 1).

Table 1 Number of methylated CpG islands in stage I lung adenocarcinoma (AC) and squamous cell carcinomas (SCC)

Squamous cell carcinomas

Using MIRA-assisted microarray analysis, we identified 36 CpG islands that were methylated in all five of the SCC tumors analyzed (Fig. 1 and Table 2). A large fraction of the methylated CpG islands mapped to homeobox genes. Since these 36 loci had excellent potential to be specific and sensitive methylation biomarkers for SCC, we analyzed 12 randomly chosen markers (BARHL2, EVX2, IRX2, MEIS1, MSX1, NR2E1, OC2, OSR1, OTX1, PAX6, TFAP2A, and ZNF577) in a larger series of 20 SCCs [34] by bisulfite-based COBRA assays [46]. This assay is semi-quantitative, technically robust, and commonly used for assessing the methylation status of CpG islands in small to medium-sized sample series, and it has a very low rate of false-positive results. The methylation frequency of the individual markers ranged from 17/20 (=85%) to 20/20 (=100%) of the tumors. The OTX1 and NR2E1 associated CpG islands were methylated in all SCC tumors tested (=100%). Several of these SCC markers were highly specific for tumor-associated methylation, i.e., little or no detectable methylation was observed in tumor-adjacent normal lung tissue or in the lungs of non-tumor patients. These included the CpG islands of the OTX1, BARHL2, MEIS1, OC2, TFAP2A, and EVX2 genes [34]. None of these CpG islands was methylated substantially in white blood cell DNA from healthy individuals.

Fig. 1

Methylation of CpG islands in lung squamous cell carcinomas. The red bars indicate methylation of individual CpG islands across a series of five stage I lung squamous cell carcinomas. The CpG islands methylated in all five tumors are marked by arrows

Table 2 List of hypermethylated CpG islands as potential markers for stage I lung squamous cell carcinoma (SCC)

Adenocarcinomas

Using MIRA-assisted microarray analysis, we identified 52 CpG islands that were methylated in at least six out of eight adenocarcinomas (Table 3). Several of these adenocarcinoma methylation markers (CHAD, DLX4, GRIK2, KCNG3, NR2E1, OSR1, OTX1, OTX2, PROX1, RUNX1, and VAX1) were chosen at random for verification by bisulfite-based COBRA assays. These selected adenocarcinoma markers were methylated in more than 80% of the tumors (Fig. 2). The CHAD gene was methylated in 8 of 11 tumors tested (data not shown). None of these CpG islands was substantially methylated in blood DNA from healthy individuals or in non-cancerous lung DNA. The tumor specificity and high methylation frequency of these genes make them excellent candidates for use in future diagnostic applications for early detection of lung cancer.
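
Selecting CpG islands that are methylated in at least six out of eight tumors amounts to counting, for each island, the number of tumors in which it was called methylated and applying a threshold. The following Python sketch illustrates that selection step only. The function name, the data layout, and the island and tumor labels are hypothetical, and the sketch does not reproduce the array criteria and cutoffs defined in [34].

```python
from collections import Counter


def recurrent_islands(methylated_by_tumor, min_tumors):
    """Return CpG islands methylated in at least `min_tumors` tumors.

    `methylated_by_tumor` maps a tumor ID to the set of CpG islands
    called methylated in that tumor.
    """
    counts = Counter()
    for islands in methylated_by_tumor.values():
        counts.update(set(islands))  # count each island once per tumor
    return sorted(i for i, n in counts.items() if n >= min_tumors)


# Hypothetical calls for three tumors; island names are placeholders.
calls = {
    "T1": {"island_A", "island_B", "island_C"},
    "T2": {"island_A", "island_C"},
    "T3": {"island_A", "island_B"},
}
print(recurrent_islands(calls, min_tumors=3))  # methylated in all three
print(recurrent_islands(calls, min_tumors=2))  # methylated in at least two
```

Setting `min_tumors` to the total number of tumors gives the strict "methylated in all tumors" criterion, while a lower value gives the relaxed "at least six out of eight" type of criterion.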

Table 3 Methylation markers for lung adenocarcinoma

Fig. 2

Verification of DNA methylation markers in normal lung tissue and in matching adenocarcinoma samples. Methylation differences between adenocarcinomas (T) and matching normal pairs (N) were detected by COBRA assays for the indicated gene targets. “–” refers to control digestion with no BstUI, “+” refers to BstUI-digested samples. Digestion by BstUI indicates methylation of the sequence tested (5′CGCG). The indicated CpG islands were analyzed (see Table 3 for chromosomal location of the CpG islands)
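
The logic of the COBRA readout can be illustrated with a small simulation. Bisulfite treatment followed by PCR reads unmethylated cytosines as thymine, whereas methylated cytosines remain cytosine, so a BstUI site (5′CGCG) survives only if both of its cytosines were methylated. The Python sketch below is a simplified single-strand model; the function names and the short amplicon sequence are hypothetical and are not taken from the primers or targets used in this study.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate bisulfite treatment followed by PCR on one strand.

    Unmethylated C reads as T after conversion and amplification;
    methylated C (0-based positions in `methylated_positions`) stays C.
    """
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq.upper())
    )


def bstui_cuts(converted_seq):
    """True if a BstUI recognition site (CGCG) survives conversion."""
    return "CGCG" in converted_seq


# Hypothetical amplicon with one CGCG site (cytosines at positions 4 and 6).
amplicon = "ATGACGCGTTAG"
methylated = bisulfite_convert(amplicon, methylated_positions={4, 6})
unmethylated = bisulfite_convert(amplicon, methylated_positions=set())
print(methylated, bstui_cuts(methylated))      # ATGACGCGTTAG True
print(unmethylated, bstui_cuts(unmethylated))  # ATGATGTGTTAG False
```

In this model, an amplicon in which only one of the two cytosines is methylated also loses the site, which is one reason why the assay reports on methylation at the tested site rather than across the whole CpG island.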

Methylation of FAT4 encoding a component of the Hippo tumor suppressor pathway in lung adenocarcinomas

One of the methylated targets in adenocarcinomas and SCCs was the gene FAT4. To confirm the microarray results, we analyzed methylation of the FAT4-associated promoter CpG island in 18 stage I adenocarcinomas and matched normal lung tissue (Fig. 3a). FAT4 was methylated in 7/18 tumors (=39%). Using a cDNA panel from Origene that contains various types of non-small cell lung tumors of different stages (stage I and II), we also confirmed that FAT4 is silenced in a series of non-small cell lung tumors compared to corresponding normal tissue. According to this RT-PCR analysis, expression of FAT4 was reduced in 18 of 23 stage I and II lung tumors relative to matched normal lung tissue (Fig. 3b), suggesting that methylation of this gene is perhaps only one mechanism leading to reduced expression in tumors. FAT4 encodes the closest mammalian homologue of the Drosophila Fat protocadherin (37% identical at the amino acid level). Fat is an upstream component of the Hippo pathway in flies [6]. The Hippo pathway is a signaling cascade involved in organ size control, tumor suppression, and apoptosis [29, 39]. Mutations of this pathway in mice have been linked to tumorigenesis [26, 40, 48]. However, it has been difficult to clearly demonstrate functional inactivation of this tumor suppressor pathway in human tumors. Mutations in Hippo pathway components, such as the MST and LATS kinases, are very infrequent. Epigenetic inactivation is seen most commonly for RASSF1A, a regulator of MST kinases [18, 19, 30], and sometimes for MST genes themselves [38]. In lung tumors, RASSF1A is most frequently inactivated by promoter methylation in small cell lung cancers [10]. In non-small cell lung cancers, RASSF1A methylation occurs at a frequency of 30–40% in adenocarcinomas and squamous cell carcinomas [5, 9]. We have not yet tested whether RASSF1A and FAT4 methylation events, both affecting upstream regulators of the Hippo pathway, are mutually exclusive or cooperate in lung tumors. Interestingly, Fat4 has recently been identified as a susceptibility gene for pulmonary adenomas in mice [4]. Thus, methylation of the promoter of FAT4 may have functional consequences for tumor initiation or progression in the lung. In this sense, methylation of FAT4 can be considered a potential epigenetic “driver” event for tumorigenesis rather than just a “passenger” event that would be without functional consequences [22].

Fig. 3

Methylation of the FAT4 gene in lung adenocarcinomas. a Methylation of the FAT4 promoter-associated CpG island was tested in 18 stage I lung adenocarcinomas (T) and matched normal lung tissue (N). Methylation differences were detected by COBRA assays. “–” refers to control digestion with no BstUI, “+” refers to BstUI-digested samples. Digestion by BstUI indicates methylation of the sequence tested (5′CGCG), which was found in seven of the tumors. b FAT4 silencing in a series of non-small cell lung tumors. Normalized cDNAs from 23 paired lung tumor/normal samples were tested by TaqMan qRT-PCR, and the expression level of FAT4 was determined relative to the β-actin internal control, with two highly expressed samples set as 1
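
Expression relative to an internal control, with a calibrator set as 1, is commonly computed from qRT-PCR threshold cycle (Ct) values by the 2^(−ΔΔCt) approach. The Python sketch below assumes that approach and assumes 100% amplification efficiency; the exact calculation used for Fig. 3b is not specified here, and all Ct values and names in the sketch are hypothetical.

```python
def relative_expression(ct_target, ct_reference, calibrator_delta_ct):
    """Relative expression by the 2^(-ddCt) approach (an assumption here).

    ct_target, ct_reference: threshold cycles of the gene of interest
    and of the internal control in one sample.
    calibrator_delta_ct: dCt of the sample(s) whose expression is set as 1,
    for example the mean dCt of the chosen calibrator samples.
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** -(delta_ct - calibrator_delta_ct)


# Hypothetical Ct values; the calibrator dCt of 5.0 is also hypothetical.
normal = relative_expression(ct_target=27.0, ct_reference=22.0,
                             calibrator_delta_ct=5.0)
tumor = relative_expression(ct_target=30.0, ct_reference=22.0,
                            calibrator_delta_ct=5.0)
print(normal, tumor)  # 1.0 0.125
```

In this hypothetical pair, the tumor sample reaches threshold three cycles later than the normal sample after normalization to the internal control, which corresponds to an eightfold lower expression level.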

DNA methylation markers for lung cancer

The novel aspect of our work has been the comprehensive methylation analysis of all CpG islands in human lung cancer using microarrays [32–34]. We were able to directly measure the methylation levels at over 27,000 CpG islands and found that between approximately 200 and 900 of these islands were methylated in individual lung SCC and AC samples (Table 1). These numbers are compatible with earlier estimates derived from analysis of only a subset of CpG islands methylated in cancer [7]. It is clear, of course, that not all of these genes can be tumor suppressor genes or can otherwise represent driver events for tumorigenesis. For example, our earlier observations indicated that a substantial subset of the methylated genes (20–40% depending on the individual tumor) were homeobox genes [32, 33]. Homeobox genes are regulated by the Polycomb complex, and it is a common observation that Polycomb-associated genes are frequently methylated in cancer [28, 33, 37, 41, 44, 45]. Homeobox gene-associated CpG islands were among the most useful DNA methylation markers identified in lung cancer. The CpG islands of the OTX1, BARHL2, MEIS1, OC2, TFAP2A, and EVX2 genes were tumor-specifically methylated in SCC with little or no detectable methylation seen in normal lung tissue or in blood DNA [34]. Importantly, the methylation frequency of these markers (85% to 100% of the tumors were methylated) is much higher than the methylation frequencies of most other lung cancer DNA methylation markers reported previously. For example, OTX1 was tumor-specifically methylated in 20/20 (=100%) of the SCC tumors. For adenocarcinomas, several promising markers have been identified, including CHAD, DLX4, GRIK2, KCNG3, NR2E1, OSR1, OTX1, OTX2, PROX1, RUNX1, and VAX1. Methylation of these genes in lung cancer has not previously been reported, except for RUNX1 [15]. Frequently methylated genes are very commonly methylated in both subtypes of non-small cell lung cancer. For example, the CpG islands associated with the NR2E1, OSR1, and OTX1 genes were methylated in both adenocarcinomas and squamous cell carcinomas at a frequency of over 95%. These markers are excellent candidates for future clinical or diagnostic applications aimed at either detection of early disease in body fluids such as blood or sputum, or at disease management and follow-up using molecular diagnostic testing.