Introduction

Despite advances in our understanding of the molecular biology of glioblastoma multiforme (GBM), one of the most aggressive human cancers, these tumors remain incurable. Median survival ranges from 12 to 15 months [1, 2]. GBMs almost universally recur after conventional therapies, including maximal surgical resection, radiation, and chemotherapy [3]. Despite progress in tumor imaging, clinical management of GBM remains difficult and the prognosis remains extremely poor because of the tumor's high-grade status [4, 5]. This highlights the urgent need for sensitive, personalized biomarkers to accurately monitor residual and recurrent tumors and enhance the clinical management of GBM patients. There is strong interest in exploiting somatic mutations, which occur exclusively in the tumor, to develop such biomarkers. One such mutation is the EGFRvIII deletion. Approximately 33 % of all high-grade gliomas express EGFRvIII, a bona fide tumor-specific antigen with potent oncogenic properties [6]. It results from an in-frame deletion of 801 bp spanning exons 2 to 7 of the EGFR coding region and leads to ligand-independent tyrosine kinase activity that persistently activates the downstream phosphatidylinositol 3-kinase (PI3-K) pathway [7]. Virtually no EGFRvIII-positive patient survives for 2 years, versus about 15 % of those who are negative. Considerable effort is currently being put into the development of anti-EGFRvIII agents, but no biomarkers are available to monitor their efficacy [8–10].

Tumor-derived mutant DNA can be detected in the cell-free fraction of the blood of individuals with cancer [11–13]. Somatic rearrangements occur frequently in GBMs, and these mutations have the potential to serve as highly sensitive biomarkers for tumor detection. Rearrangement-associated biomarkers could therefore offer a reliable measure for monitoring tumor response to specific therapies, detecting residual disease after surgery, and long-term clinical management. We tested the feasibility of this strategy in a pilot study of 13 GBM patients. Detecting the EGFRvIII mutation in genomic DNA is complex because of the presence of several recombination sites in intron 1 (123 kb) and in intron 7 of the EGFR gene. These sites are involved in DNA recombination events that generate genomic deletions of varying sizes, leading to structural differences between GBM patients. We therefore used a long range PCR amplification technique to detect the EGFRvIII deletions and determined the deletion breakpoints using genomic DNA from the tumor and from white blood cells (WBC). The data suggest that the amount of circulating mutant EGFRvIII DNA correlates with the extent of the tumor resection and, once validated in a larger cohort, could be used as a noninvasive biomarker to monitor disease status in patients on treatment.

Materials and methods

Patients and tumor samples

This study was conducted under an Institutional Review Board (IRB) approved protocol. Thirteen patients newly diagnosed with GBM and scheduled for surgery at the University of Cincinnati Hospital gave consent for the collection of tumor tissue, of blood drawn immediately before surgery, and of blood drawn 3 weeks after surgery. Blood samples were processed within 2 h for plasma separation and WBC isolation. Plasma, WBC, and tumor tissue snap-frozen in OCT were stored at −85 °C until use.

RNA extraction and RT-PCR amplification

Total RNA was extracted from about 3 mm² sections using the “illustra triplePrep Kit” (GE Healthcare Bio-Sciences Corp). Complementary DNA (cDNA) was reverse transcribed from RNA in 20 μL reactions using the “iScript cDNA Synthesis Kit” (Bio-Rad Laboratories, Inc, Hercules, CA) according to the manufacturer’s protocol. The resulting cDNA was used in PCR amplifications with “GoTaq Green Master Mix” from Promega Corporation (Madison, WI) to determine the EGFRvIII status of each tumor tissue. A forward primer from exon 1 (5′-CTCTTCGGGGAGCAGCGATGC-3′) and a reverse primer from exon 9 (5′-CCACACAGCAAAGCAGAAAC-3′) of the EGFR gene were used in the reaction (IDT Integrated DNA Technologies, Coralville, IA). Approximately 30 ng of cDNA and 130 ng of primers were used in each PCR reaction. To validate the EGFRvIII status, the resulting products were subjected to Sanger sequencing using the same forward and reverse primers that yielded the PCR product.
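As a quick sanity check on the primer pair quoted above, the following sketch computes the length, GC content and a rough melting temperature for each primer. The Wallace rule (2 °C per A/T, 4 °C per G/C) is an assumption introduced here for illustration and gives only a crude estimate for oligonucleotides of this length; the function names are hypothetical and are not part of the published protocol.

```python
# Primer sequences exactly as quoted in the text (5' -> 3').
PRIMERS = {
    "exon1_forward": "CTCTTCGGGGAGCAGCGATGC",
    "exon9_reverse": "CCACACAGCAAAGCAGAAAC",
}


def gc_count(seq: str) -> int:
    """Count G and C bases in a DNA sequence."""
    return sum(1 for base in seq.upper() if base in "GC")


def wallace_tm(seq: str) -> int:
    """Rough melting temperature in deg C: 2 per A/T plus 4 per G/C (assumed rule)."""
    gc = gc_count(seq)
    at = len(seq) - gc  # assumes the sequence contains only A, C, G and T
    return 2 * at + 4 * gc


def summarize(seq: str) -> dict:
    """Collect length, GC percentage and the rough Tm for one primer."""
    return {
        "length": len(seq),
        "gc_percent": round(100 * gc_count(seq) / len(seq), 1),
        "tm_wallace_c": wallace_tm(seq),
    }


for name, seq in PRIMERS.items():
    print(name, summarize(seq))
```

Under this rule the two primers differ by about 10 °C, which is only indicative; the annealing temperature actually used is not stated in the text.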

Genomic tumor DNA extraction and long range PCR amplification

Genomic DNA was extracted from the frozen tumor tissue using the “illustra triplePrep Kit” (GE Healthcare Bio-Sciences Corp). For long range PCR amplification, two sets of forward primers (set A containing 12 primers and set B containing 11 primers) were designed to lie 5 kb apart from each other and to span the length of intron 1 of the EGFR gene. The reverse primer was placed in exon 8. PCR reactions on gDNA were carried out using “GoTaq Green Master Mix” from Promega Corporation (Madison, WI) supplemented with 0.5 μL of Crimson LongAmp Taq DNA polymerase (2,500 U/ml) from New England BioLabs Inc. The PCR products were gel purified and subjected to Sanger sequencing to determine the deletion breakpoint.
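The primer layout described above can be pictured with a short sketch. Only the primer counts (12 and 11), the 5 kb spacing and the 123 kb intron size come from the text; the zero starting offset and the idea of alternating consecutive primers between sets A and B are assumptions made here purely for illustration, and the actual coordinates are not given in the paper.

```python
INTRON1_LENGTH_BP = 123_000  # intron 1 size quoted in the text (123 kb)


def tile_forward_primers(n_primers=23, spacing_bp=5_000, first_offset_bp=0):
    """Place forward primers at a fixed spacing and alternate them between two sets.

    The alternation and the offset are illustrative assumptions, not the
    published design.
    """
    layout = []
    for index in range(n_primers):
        position = first_offset_bp + index * spacing_bp
        primer_set = "A" if index % 2 == 0 else "B"
        layout.append((position, primer_set))
    return layout


layout = tile_forward_primers()
set_a = [pos for pos, primer_set in layout if primer_set == "A"]
set_b = [pos for pos, primer_set in layout if primer_set == "B"]

# Report set sizes and how far into the intron the last primer sits.
print(len(set_a), len(set_b), layout[-1][0], INTRON1_LENGTH_BP)
```

With 23 primers at 5 kb intervals, alternating assignment gives 12 primers in one set and 11 in the other, and the last primer sits 110 kb into the 123 kb intron.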

Circulating DNA extraction and PCR amplification through the EGFRvIII deletion

Circulating DNA was extracted from plasma (including exosomes) using the NucleoSpin Plasma XS kit from Macherey–Nagel GmbH & Co. (Bethlehem, PA). About 0.4–0.5 μg of DNA was consistently obtained from 1 mL of plasma, and 24 ng of circulating DNA was sufficient to detect the deletion by PCR. For deletion detection, primers flanking the breakpoints were designed to yield a fragment of about 300 bp. These primers were used to detect the deletion in the genomic DNA extracted from both the tumor and the plasma.
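To put the quantities above in perspective, the sketch below converts the 24 ng PCR input into an approximate number of haploid genome copies and into the plasma volume it corresponds to at the reported yield. The figure of about 3.3 pg per haploid human genome is a commonly used approximation that is assumed here; it does not come from the text, and the function names are hypothetical.

```python
HAPLOID_GENOME_PG = 3.3  # assumed approximate mass of one haploid human genome, in pg


def haploid_genome_copies(dna_ng: float) -> float:
    """Approximate number of haploid genome copies in a given mass of DNA."""
    return dna_ng * 1000 / HAPLOID_GENOME_PG


def plasma_volume_ul(dna_ng: float, yield_ng_per_ml: float) -> float:
    """Plasma volume (in microlitres) that yields the given DNA mass."""
    return dna_ng / yield_ng_per_ml * 1000


copies = haploid_genome_copies(24)
low_yield_volume = plasma_volume_ul(24, 400)   # 0.4 ug of DNA per mL of plasma
high_yield_volume = plasma_volume_ul(24, 500)  # 0.5 ug of DNA per mL of plasma
print(round(copies), round(low_yield_volume), round(high_yield_volume))
```

Under these assumptions, 24 ng corresponds to roughly 7,300 haploid genome copies and to about 50 to 60 μL of plasma.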

Whole genome sequencing

Randomly fragmented gDNA (~500 bp) was size-selected for the construction of the paired end tagged (PET) libraries [14]. The libraries were paired-end sequenced using an Illumina HiSeq platform with a read length of 100 bp (Axeq Technologies, Macrogen Inc., Rockville, MD). About 34–37 gigabases (Gb) of sequence were mapped to the human reference sequence (RefSeq), with an average mapping coverage of 22–25-fold. The raw sequence data were aligned to the human RefSeq (hg19) using the Bowtie 2 Aligner [15]. Four different types of tumor-specific genomic structural variations (SVs), i.e. deletion (DEL), inversion (INV), and intra- and interchromosomal translocation (ITX and CTX), were detected using Control-FREEC software [16] and confirmed using the Integrative Genomics Viewer (IGV) [17]. The data analysis pipeline used in this study is represented in Fig. 1.

Fig. 1

Whole genome sequencing data analysis pipeline. The data analysis pipeline used during this study allows detection of structural variations, single nucleotide polymorphisms as well as copy number variations
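The principle by which paired-end data can reveal a deletion is sketched below: when the two reads of a pair map much farther apart on the reference than the library fragment size (about 500 bp here), the intervening reference sequence is probably missing from the sample. This is a minimal illustration with invented coordinates and a hypothetical `ReadPair` type; it is not the algorithm used by the software cited above, and the tolerance value is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadPair:
    chrom: str
    left_start: int  # leftmost mapped coordinate of the pair
    right_end: int   # rightmost mapped coordinate of the pair


def candidate_deletions(pairs, expected_insert=500, tolerance=200):
    """Return (pair, estimated_deletion_bp) for pairs that map too far apart."""
    flagged = []
    for pair in pairs:
        span = pair.right_end - pair.left_start
        if span > expected_insert + tolerance:
            # The excess span approximates the size of the deleted segment.
            flagged.append((pair, span - expected_insert))
    return flagged


# Toy read pairs with invented coordinates, for illustration only.
toy_pairs = [
    ReadPair("chr7", 1_000, 1_480),    # concordant: span close to 500 bp
    ReadPair("chr7", 2_000, 2_650),    # within tolerance
    ReadPair("chr7", 10_000, 45_500),  # discordant: suggests a large deletion
]

flagged = candidate_deletions(toy_pairs)
for pair, estimated_size in flagged:
    print(pair, estimated_size)
```

In practice a cluster of several discordant pairs supporting the same interval would be required before calling a deletion, and the call would then be inspected visually, as was done here with IGV.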

Results

Detection of GBM patients carrying the EGFRvIII deletion

The EGFRvIII variant results from a deletion of exons 2 through 7 of the EGFR gene, which fuses exon 1 to exon 8 (Fig. 2a). This deletion leads to the loss of 267 amino acids from the extracellular domain of the EGFR protein and renders the mutant protein unable to bind its ligand (Fig. 2b). To detect GBM patients that carry the EGFRvIII deletion, RNA was isolated from the tumors of 13 patients and subjected to reverse transcription PCR (RT-PCR) to generate cDNA. Using a sense primer in exon 1 and an antisense primer in exon 9, PCR amplification showed that three patients (23 %) carry the EGFRvIII deletion. Wild type (WT) EGFR yielded a band of approximately 1,150 bp, while the mutant variant yielded a band of about 320 bp owing to the presumed fusion of exon 1 and exon 8 (Fig. 2c, d). Sanger sequencing confirmed the fusion of exon 1 and exon 8 in patients 1, 7 and 9, while the other patients were wild type, like the EGFR gene in the U373 GBM cell line control (Fig. 2d). Although EGFRvIII tumors are usually heterogeneous and also contain wild type EGFR (as in the control tissue from patient 282), we obtained only the PCR band corresponding to mutant EGFR in these tumors (Fig. 2d).

Fig. 2

A schematic representation and detection of EGFRvIII deletion. a Genomic DNA structure of EGFRvIII mutant compared to wild type gene. b EGFRvIII protein showing the ligand binding domain deletion. c Sequencing result of wild type and mutant genes showing the fusion of exon 1 to exon 8 in the EGFRvIII cDNA. d Detection of EGFRvIII patients using RT-PCR
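The numbers behind the RT-PCR readout described above can be checked with a few lines of arithmetic. The sketch uses only values quoted in the text (an 801 bp deletion and a wild type band of about 1,150 bp); the function names are hypothetical.

```python
DELETION_BP = 801     # in-frame deletion of exons 2 to 7, as quoted in the text
WT_PRODUCT_BP = 1150  # approximate wild type RT-PCR band, as quoted in the text


def is_in_frame(deletion_bp: int) -> bool:
    """A deletion preserves the reading frame when its length is a multiple of 3."""
    return deletion_bp % 3 == 0


def residues_lost(deletion_bp: int) -> int:
    """Number of amino acids removed by an in-frame deletion."""
    if not is_in_frame(deletion_bp):
        raise ValueError("deletion is not in frame")
    return deletion_bp // 3


def expected_mutant_product(wt_product_bp: int, deletion_bp: int) -> int:
    """Expected size of the RT-PCR product once the deleted exons are removed."""
    return wt_product_bp - deletion_bp


print(is_in_frame(DELETION_BP), residues_lost(DELETION_BP),
      expected_mutant_product(WT_PRODUCT_BP, DELETION_BP))
```

The deletion is a multiple of three and removes 267 codons, matching the 267 amino acids lost. The expected mutant product of roughly 350 bp is in the same range as the band of about 320 bp reported, bearing in mind that both band sizes are approximate gel estimates.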

Detection of EGFRvIII deletions in the genomic DNA and determination of the breakpoints

The presence of 11 Alu sites [18] in intron 1 (123 kb) and one Alu site in intron 7 of the EGFR gene gives rise to genomic deletions of varying sizes, so that the EGFRvIII deletion differs between GBM patients. Although these deletions differ at the genomic level, the mRNA is spliced the same way, leading to the same truncated protein in all patients. We developed a long range PCR-based strategy that uses forward primers spanning the whole of intron 1 and a reverse primer in exon 8 (Fig. 3a). These primers, spaced 5 kb apart, amplified several PCR products from the patients’ tumor genomic DNA but not from the constitutional DNA from WBC, indicating potential EGFRvIII deletions (Fig. 3b, c, d). These PCR products were Sanger sequenced and their EGFRvIII status confirmed. We obtained two confirmed populations of EGFRvIII deletions for patients 1 and 7 and one population in patient 9. Surprisingly, in patient 7, one of the deletions did not involve a direct recombination between intron 1 and intron 7 of EGFR but instead involved sequences adjacent to the EGFR gene, namely the region containing the SEPT14 and SEC61G genes (Fig. 3e). Patients 1 and 9, however, showed an intragenic recombination between intron 1 and intron 7 (Fig. 3f). These recombinations and the resulting EGFRvIII deletions were confirmed using next generation sequencing of patient 7’s normal and tumor DNA.

Fig. 3

Detection of EGFRvIII genomic deletions and determination of the breakpoints. a A schematic representation of the EGFRvIII genomic DNA showing the loss of exons 2 through 7 and also the location of the primers used in the long range PCR. b, c and d The result of long range PCR amplifications showing specific bands in patients 1, 7 and 9 but not in WBC. The desired bands produced using Set B of primers are shown. Asterisk indicates a nonspecific band. e Sanger sequencing showing that EGFRvIII deletion can also involve intergenic recombinations. f Intragenic recombination in patients 1 and 9
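The logic of the long range PCR screen described above can be sketched as follows: a forward primer in intron 1 gives a product with the exon 8 reverse primer only if the deletion brings the two primers within amplifiable distance, which is why tumor DNA yields bands and intact WBC DNA does not. All coordinates below are invented for illustration, and the 20 kb amplification limit is an assumed value, not one stated in the text.

```python
def amplicon_sizes(primer_positions, reverse_pos, del_start, del_end,
                   max_amplicon=20_000):
    """Map each usable forward primer position to its expected product size.

    Primers whose binding site lies inside the deleted segment are skipped,
    and products longer than max_amplicon (an assumed limit) are discarded.
    """
    deleted = del_end - del_start
    products = {}
    for pos in primer_positions:
        if del_start <= pos < del_end:
            continue  # binding site removed by the deletion
        removed = deleted if pos < del_start else 0
        size = reverse_pos - pos - removed
        if 0 < size <= max_amplicon:
            products[pos] = size
    return products


# Hypothetical layout: forward primers every 5 kb, reverse primer 150 kb downstream.
forward_primers = range(0, 115_000, 5_000)
REVERSE_PRIMER_POS = 150_000

# Tumor allele with a made-up deletion from position 62,000 to 140,000.
tumor_products = amplicon_sizes(forward_primers, REVERSE_PRIMER_POS, 62_000, 140_000)

# Intact (wild type) allele: no deletion, so every product is far too long.
wbc_products = amplicon_sizes(forward_primers, REVERSE_PRIMER_POS, 0, 0)

print(tumor_products, wbc_products)
```

Only the primers lying just upstream of the breakpoint produce a band, so the identity of the positive primer narrows down the breakpoint position before Sanger sequencing pins it exactly.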

Confirmation of the EGFRvIII deletions using next generation paired-end sequencing

To confirm the identity of the EGFRvIII deletions detected by long range PCR amplification, and to check whether other deletions were missed by our strategy, genomic DNA from patient 7 and the corresponding normal DNA from WBC were subjected to whole genome sequencing using the Illumina GAII platform. As with the long range PCR amplification, two separate deletions in intron 1 were detected and confirmed using IGV (Fig. 4). Figure 4a shows the start of the deletion in intron 1, while Fig. 4c shows the end of the deletion in intron 7. Not only were we able to detect the start and end of each deletion, but we were also able to confirm the involvement of the region around the SEPT14 and SEC61G genes in the recombination, as indicated by the rearrangement of these two domains in this patient (Fig. 4d, e). These findings indicate that our long range PCR strategy detects EGFRvIII deletions with very high confidence and can be used to detect the deletion in genomic DNA without the need to sequence the whole genome, which can be costly and time consuming.

Fig. 4

Confirmation of the EGFRvIII deletion using next generation sequencing. a and b IGV confirms the presence of two EGFRvIII deletion start sites in intron 1 in patient 7. c One stop site in intron 7. d and e One of the EGFRvIII deletions in patient 7 involved an intergenic recombination with sequences around the SEPT14 and SEC61G genes

Tracking of the EGFRvIII deletion in the peripheral blood

To track the EGFRvIII deletion in the peripheral blood of the patients that carry this mutation, and to evaluate whether the mutation can be used to monitor the status of the tumor, blood was collected from the patients shortly before surgery and 3 weeks after surgery. Primers were designed around the deletions to generate a PCR fragment of about 300 bp when the deletion is present. In the wild type, the fragment is too large to be amplified by conventional PCR and therefore no PCR product is expected (Fig. 5a). As predicted, PCR amplification from genomic tumor DNA (gDNA) produced a band of the expected size, while wild type DNA from WBC did not (Fig. 5b). GAPDH was used as a control. To check whether the amount of mutant DNA detected in the plasma reflects the status of the tumor, we amplified the mutant DNA from the plasma of patients 1, 7 and 9. Patient 7 had an incomplete resection of the tumor, while patients 1 and 9 had a complete resection. Consistent with these resection outcomes, the plasma of patients 1 and 9 contained no detectable circulating tumor DNA after surgery, while patient 7 showed a residual amount of tumor DNA, reflecting the incomplete resection of the tumor in this patient. These data show that this strategy is promising for detecting EGFRvIII deletions in genomic DNA and for tracking them in the peripheral blood. However, it needs to be validated in a larger cohort of GBM patients before it can be used routinely in brain tumor management.

Fig. 5

Detection and tracking of the EGFRvIII deletion in the plasma. a Detection strategy of the EGFRvIII deletion by PCR. b PCR amplification of the deletion from genomic DNA using primers adjacent to the breakpoint. c Detection of the EGFRvIII deletion in the plasma of patient 7 before and after surgery. d Detection of the EGFRvIII deletion in the plasma of patient 1 before and after surgery and e Detection of the EGFRvIII deletion in the plasma of patient 9 before and after surgery. The quality of the circulating tumor DNA is variable between patients due to a difference in time between the blood draw and DNA extraction
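The qualitative outcome reported above for the three EGFRvIII-positive patients can be written down as a small concordance check between resection status and post-operative plasma signal. The records simply restate what the text reports; the field names are hypothetical, and with three patients and gel-based read-outs the result carries no statistical weight.

```python
# Post-operative observations as reported in the text for patients 1, 7 and 9.
OBSERVATIONS = [
    {"patient": 1, "resection": "complete", "postop_ctdna_detected": False},
    {"patient": 7, "resection": "incomplete", "postop_ctdna_detected": True},
    {"patient": 9, "resection": "complete", "postop_ctdna_detected": False},
]


def is_concordant(observation: dict) -> bool:
    """Residual tumor DNA in plasma is expected only after incomplete resection."""
    expected_signal = observation["resection"] == "incomplete"
    return observation["postop_ctdna_detected"] == expected_signal


concordant_patients = [obs["patient"] for obs in OBSERVATIONS if is_concordant(obs)]
print(concordant_patients, len(concordant_patients), "of", len(OBSERVATIONS))
```

A larger cohort would allow the same tabulation to be turned into sensitivity and specificity estimates against imaging or surgical findings.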

Discussion

Clinical management of human cancer depends on the accurate monitoring of residual and recurrent tumors. The evaluation of patient-specific translocations in leukemias and lymphomas has revolutionized diagnostics for these diseases. The measurement of circulating viral nucleic acids has transformed the management of chronic viral infections such as HIV, and the development of analogous markers for individuals with cancer could similarly enhance the management of their disease. DNA containing somatic mutations is highly tumor specific and thus, in theory, can provide optimum markers. Recent studies have evaluated the potential application of tumor-derived circulating DNA as a diagnostic tool for early detection of systemic cancer, for prediction of tumor progression, or as a means to monitor the response to therapy [19–21]. In these studies the authors not only detected but also quantified the amount of extracellular mutant DNA in subject plasma and showed its usefulness for monitoring subjects with metastatic colorectal cancer [20]. In fact, 15 out of 16 subjects in whom free circulating tumor DNA was detectable suffered a relapse, whereas the subject without circulating tumor DNA did not experience tumor recurrence [19]. This was achieved by sequencing a panel of somatic mutations (in TP53, KRAS, APC and PIK3CA) in study subjects’ tumors and then quantifying the total amount of free circulating plasma DNA by real-time PCR [19].

Leary et al. [22] developed a method, called personalized analysis of rearranged ends (PARE), which can identify translocations in solid tumors. The authors analyzed four colorectal and two breast cancers with massively parallel sequencing and found an average of nine rearranged sequences per tumor. PCR with primers spanning the breakpoints detected mutant DNA molecules present at levels lower than 0.001 % and readily identified mutated circulating DNA in patient plasma samples. This method provides an exquisitely sensitive and broadly applicable approach for the development of personalized biomarkers to enhance the clinical management of cancer patients.
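To give a feel for what a mutant fraction of 0.001 % means in terms of input material, the sketch below estimates the chance that a sample contains at least one mutant template. It assumes simple Poisson sampling of independent molecules, which is a modelling assumption introduced here and not a claim about how the cited study assessed sensitivity; the function names are hypothetical.

```python
import math


def prob_at_least_one_mutant(genome_copies: float, mutant_fraction: float) -> float:
    """Probability that a sample holds one or more mutant templates (Poisson model)."""
    expected_mutants = genome_copies * mutant_fraction
    return 1.0 - math.exp(-expected_mutants)


def copies_needed(mutant_fraction: float, target_prob: float = 0.95) -> int:
    """Genome copies required to reach the target probability of sampling a mutant."""
    return math.ceil(-math.log(1.0 - target_prob) / mutant_fraction)


FRACTION = 1e-5  # 0.001 % expressed as a fraction

print(round(prob_at_least_one_mutant(100_000, FRACTION), 3))
print(copies_needed(FRACTION))
```

Under this model, about 300,000 genome copies would be needed to have a 95 % chance of sampling a single mutant molecule at a fraction of 0.001 %, which illustrates why the amount of plasma DNA analyzed matters as much as the specificity of the breakpoint-spanning primers.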

In this report, we tested whether this strategy can be applied to brain tumors carrying the EGFRvIII deletion. The EGFRvIII deletion is tumor specific and is therefore an ideal mutation to follow and quantify in the peripheral blood of patients on treatment. Detecting this mutation in genomic DNA is challenging, however, because the deletion breakpoint differs from one patient to another. To address this problem, we developed a long range PCR amplification strategy designed to detect all possible EGFRvIII deletions. We were able to detect the deletion breakpoints in all three EGFRvIII-positive patients. The breakpoints were confirmed by Sanger sequencing and show that these deletions may or may not be associated with an Alu recombination site, and can involve either a direct recombination between intron 1 and intron 7 of the EGFR gene or an intergenic recombination involving sequences surrounding the EGFR gene.

To test whether our PCR strategy is efficient in detecting all the deletions present in these patients, we carried out whole genome sequencing of tumor and normal genomic DNA from patient 7. The data strongly support our long range PCR results and confirm that our strategy can detect the EGFRvIII deletion with very high confidence. We then tested whether the detected mutations can predict tumor status in patients on treatment. To this end we analyzed the level of tumor-derived DNA in the plasma of patients before and after surgery. As expected, the mutation was easily detected in the plasma before surgery but, after surgery, was completely absent in patients 1 and 9, who had a complete resection, and barely visible in patient 7, who had an incomplete resection.

The data described in this report suggest that a long range PCR strategy can be successful in detecting EGFRvIII deletions without the need to sequence the whole genome of a patient. The data also suggest that quantification of the EGFRvIII deletion in the plasma can be a useful tool to monitor brain tumor dynamics in patients on treatment. Other deletions commonly present in brain tumors could be used in the same way as the EGFRvIII deletion. Examples include the CDKN2A deletion in the 9p21.3 region, which occurs in approximately 31–50 % of GBMs [1], and the ERRFI1 deletion in the 1p36.23 region, which occurs in about 35 % of GBM tumors [23]. Besides large deletions, single nucleotide mutations in genes such as IDH1 [24], TP53 [25] and PIK3CA [26] are also common in brain tumors and can be quantified in the plasma. Indeed, recent work has shown that detection of the IDH1(R132H) mutation in blood has valuable diagnostic accuracy in GBM patients not amenable to biopsy [27].

No biomarkers are currently available to follow brain tumor patients on treatment. The tumor-specific EGFRvIII mutant transcript was detected in serum microvesicles from glioblastoma patients [28] and could represent another promising diagnostic strategy. However, using RNA extracted from microvesicles may have limitations in identifying all EGFRvIII patients. Indeed, this strategy detected the mutation in only 40 % of confirmed EGFRvIII patients. Given the noninvasive nature of circulating mutant DNA analysis, it is a worthwhile field for future investigation. The strategy developed herein for EGFRvIII supports the feasibility of this approach. While this strategy is promising, it needs to be validated in a larger cohort of brain tumor patients before it can be used routinely in brain tumor management. A study including a large cohort is underway to further ascertain the use of this strategy to follow disease progression in patients on treatment and to compare the sensitivity of this assay with the imaging modalities currently used in the clinic.