Introduction

Genomics has significantly advanced our knowledge of the biology of posterior fossa ependymoma [9, 10, 11, 14, 20, 23, 24, 26]. Posterior fossa ependymomas comprise three distinct molecular subgroups, defined by gene expression and genome-wide DNA methylation, which display highly disparate demographics, cytogenetics, and outcomes [6, 11, 12, 14, 15, 16, 26]. PFA (PF-EPN-A) ependymomas arise in younger patients, primarily infants, and generally have a bland genome and a poor outcome. In contrast, PFB (PF-EPN-B) ependymomas occur in older children and adults, harbor multiple broad cytogenetic aberrations, and generally have a favorable outcome. The third molecular subgroup comprises mostly grade I subependymomas (PF-SE), which occur almost exclusively in adults and also have a good outcome [14]. Currently, the standard of care for all posterior fossa ependymomas in patients under 18 is maximal safe surgical resection followed by conformal radiation to the tumor bed; in adults, however, the use of adjuvant postoperative radiation is variable, so a subset of patients is treated with surgery only [19].

Recently, we demonstrated profound differences in clinical behavior and response to therapy between PFA and PFB [14, 15, 26]. PFA have a poor outcome overall, with an even worse outcome when sub-totally resected. PFB have an excellent outcome overall, and a significant proportion of patients can be cured with a gross total resection alone. Therefore, patients with PFB ependymoma, particularly children, are excellent candidates for radiation-sparing strategies, and carefully controlled studies of observation alone after gross total resection are being contemplated. Gain of 1q has been proposed as a marker of poor prognosis in PFA ependymoma; however, its role in PFB is unclear [1, 4, 7, 8, 17, 25]. We have also recently shown that PFA ependymomas are highly heterogeneous, comprising two major groups with multiple subtypes [13]. In contrast, PFB has been assumed to be a homogeneous group, and it is unknown whether molecular and/or clinical heterogeneity and additional substructure exist within PFB ependymoma. Unraveling any such heterogeneity will be a crucial starting point for the development of personalized therapies within PFB. We assembled the largest cohort to date of clinically annotated PFB ependymomas profiled using genome-wide DNA methylation arrays to assess molecular heterogeneity within the PFB ependymoma subgroup.

Methods

Patient cohort

Two hundred and twelve posterior fossa ependymomas were obtained through the GENE consortium (Global Ependymoma Network of Excellence), St. Jude Children’s Research Hospital, the Collaborative Ependymoma Research Network, the Burdenko Neurosurgical Institute, and the German Cancer Research Center. Both frozen and formalin-fixed paraffin-embedded (FFPE) samples were collected at diagnosis. All samples were collected in accordance with the requirements of the Hospital for Sick Children Research Ethics Board and local institutional research ethics boards. All frozen samples were snap frozen and stored at −80 °C. FFPE tissue was collected as scrolls or unstained slides. A subtotal resection was defined as more than 5 mm of residual tumor on the postoperative MRI.

Genome-wide DNA methylation profiling

Samples were analyzed on the Illumina Infinium HumanMethylation450 or HumanMethylationEPIC BeadChips at the PM-OICR Translational Genomics Laboratory (Toronto, Ontario) or the German Cancer Research Center (Heidelberg, Germany) according to the manufacturer’s instructions and as previously described [14]. All analyses were conducted in the R Statistical Environment (v3.4.1). Raw data files (.idat) generated by the Illumina iScan array scanner from both frozen and FFPE-derived tissue were loaded and preprocessed using the minfi package (v1.22.1). Illumina preprocessing was selected to mimic the normalization performed in Illumina Genome Studio. To account for possible batch effects due to divergent protocols for fresh frozen and FFPE material, a batch adjustment was performed. Batch effects were estimated by fitting a linear model to the log2-transformed intensity values of the methylated and unmethylated channels. After removing the component due to the batch effect, the residuals were back-transformed to the intensity scale, and methylation beta values were calculated as described in Illumina’s protocols. Subsequently, the following probes were removed: probes targeting the X and Y chromosomes, probes containing a single-nucleotide polymorphism (dbSNP132 Common) within five base pairs of and including the targeted CpG site, and probes not mapping uniquely to the human reference genome (hg19) allowing for one mismatch. In total, 431,069 probes were kept for analysis. PFB status was determined using the Heidelberg brain tumor classifier (https://www.molecularneuropathology.org/mnp) and unsupervised hierarchical clustering as previously described [2, 14]. Raw and processed methylation data have been deposited into GEO under the accession number GSE117130.
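The batch adjustment and beta-value calculation described above can be illustrated with a minimal sketch. It is not the pipeline used in this study: it assumes the simplest possible linear model (batch as the only covariate, so that the batch effect reduces to a per-batch mean on the log2 scale), and it assumes the commonly quoted Illumina beta-value definition with a constant offset of 100 in the denominator. The function names and the toy intensities are hypothetical.

```python
import math
from collections import defaultdict


def remove_batch_effect(intensities, batches):
    """Centre one probe channel per batch on the log2 scale, then back-transform."""
    logs = [math.log2(value) for value in intensities]
    grand_mean = sum(logs) / len(logs)

    # Mean log2 intensity within each batch (e.g. "frozen" vs "FFPE").
    grouped = defaultdict(list)
    for log_value, batch in zip(logs, batches):
        grouped[batch].append(log_value)
    batch_mean = {batch: sum(vals) / len(vals) for batch, vals in grouped.items()}

    # Residual after removing the batch mean, shifted to the grand mean,
    # then returned to the intensity scale.
    return [
        2 ** (log_value - batch_mean[batch] + grand_mean)
        for log_value, batch in zip(logs, batches)
    ]


def beta_value(methylated, unmethylated, offset=100.0):
    """Beta = M / (M + U + offset); the offset of 100 is an assumption here."""
    return methylated / (methylated + unmethylated + offset)


# Toy data for a single probe: two frozen and two FFPE samples (made-up values).
batches = ["frozen", "frozen", "FFPE", "FFPE"]
meth = remove_batch_effect([1000.0, 2000.0, 4000.0, 8000.0], batches)
unmeth = remove_batch_effect([500.0, 500.0, 1000.0, 1000.0], batches)
betas = [beta_value(m, u) for m, u in zip(meth, unmeth)]
```

In this toy example the FFPE samples have systematically higher raw intensities; after adjustment the two batches share the same mean on the log2 scale, so the resulting beta values differ only by within-batch variation.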

Gene expression profiling

Gene expression profiles of PFB ependymoma samples, generated on the Affymetrix GeneChip Human Genome U133 Plus 2.0 Array (Affymetrix, Santa Clara, USA) at the Microarray Department of the University of Amsterdam, The Netherlands, were processed as previously described (GSE64415) [13, 14]. Briefly, data were normalized using the MAS5.0 algorithm, and differentially expressed genes were identified within the R2: Genomics Analysis and Visualization Platform (http://r2.amc.nl) by comparing each of the five subtypes against the others. TMEV software was used to generate the heatmap displaying the top ten most upregulated genes per subtype [21]. Differentially expressed genes (adj. p < 0.05) from the comparison of one subtype versus the others were used to perform pathway analysis with g:Profiler [18].

Unsupervised t-SNE analysis

For unsupervised t-Distributed Stochastic Neighbor Embedding (t-SNE) analysis, we selected the 6749 most variably methylated probes across the data set (s.d. > 0.25). Pairwise sample distances were calculated using 1 minus the weighted Pearson correlation coefficient as the distance measure. Unweighted correlation coefficients provided similar results; weighted correlations were used to reduce noise. Pairwise Pearson correlation was calculated using the wtd.cors function of the weights package v.0.85. The weight of each probe was its standard deviation minus 0.25, giving more variable probes greater influence. The resulting distance matrix was used to perform the t-SNE analysis (Rtsne package v.0.13). The following non-default parameters were used: θ = 0, is_distance = T, pca = F, and max_iter = 2000. Clusters were annotated using the DBSCAN algorithm as implemented in the dbscan package v.1.1.1. The following non-default parameters were used: minPts = 10, eps = 2.4. Subsequently, samples not assigned to any cluster were iteratively merged into their nearest cluster.
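The distance measure described above (1 minus the weighted Pearson correlation, with each probe weighted by its standard deviation minus 0.25) can be sketched as follows. This is an illustrative re-implementation in plain Python, not the wtd.cors code used in the study; the function names and the three-probe toy samples are hypothetical.

```python
import math


def probe_weights(probe_sds, threshold=0.25):
    """Weight = probe s.d. minus the selection threshold (positive for kept probes)."""
    return [sd - threshold for sd in probe_sds]


def weighted_pearson(x, y, weights):
    """Pearson correlation in which each probe contributes according to its weight."""
    total = sum(weights)
    mean_x = sum(w * a for w, a in zip(weights, x)) / total
    mean_y = sum(w * b for w, b in zip(weights, y)) / total
    cov = sum(w * (a - mean_x) * (b - mean_y) for w, a, b in zip(weights, x, y))
    var_x = sum(w * (a - mean_x) ** 2 for w, a in zip(weights, x))
    var_y = sum(w * (b - mean_y) ** 2 for w, b in zip(weights, y))
    return cov / math.sqrt(var_x * var_y)


def correlation_distance(x, y, weights):
    """Distance between two samples: 1 minus the weighted Pearson correlation."""
    return 1.0 - weighted_pearson(x, y, weights)


# Toy beta values for three probes in three samples (made-up numbers).
weights = probe_weights([0.30, 0.35, 0.45])
sample_a = [0.1, 0.5, 0.9]
sample_b = [0.2, 0.6, 1.0]   # same pattern as sample_a, shifted
sample_c = [0.9, 0.5, 0.1]   # opposite pattern to sample_a
```

Samples with the same methylation pattern have a distance near 0, and samples with opposite patterns have a distance near 2. Because only probes with s.d. > 0.25 are retained, all weights are positive.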

Spectral clustering

The same weighted distance matrix used for the unsupervised t-SNE analysis was used to perform spectral clustering. Spectral clustering, as implemented in the SNFtool package (v2.2.1), was run on the patient-by-patient similarity matrix to obtain groups corresponding to k = 2–8.

Copy number variation (CNV) analysis

Copy-number segmentation was performed from genome-wide methylation arrays using the conumee package (v0.99.4) in the R statistical environment (v3.3.3) as previously described [5, 22]. Broad copy-number events were determined using visual inspection of copy-number plots and significance of the frequency of each broad event was tested using the exact binomial test. Each broad event frequency was compared to the background frequency, which was determined from a robust regression of the observed frequencies with respect to gene content (i.e., number of RefSeq genes) across all chromosomes. This approach was motivated by GISTIC’s broad event analysis [3].
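As a rough illustration of the frequency test described above, the sketch below computes an exact binomial tail probability for an observed broad event count against a background frequency. Two assumptions are made that the text does not specify: the test is taken to be one-sided (enrichment above background), and the background frequency of 0.20 used in the example is a made-up placeholder rather than a value from the robust regression.

```python
from math import comb


def binomial_upper_tail(observed, n_samples, background):
    """Exact P(X >= observed) for X ~ Binomial(n_samples, background)."""
    return sum(
        comb(n_samples, i) * background ** i * (1.0 - background) ** (n_samples - i)
        for i in range(observed, n_samples + 1)
    )


# Hypothetical example: an event seen in 130 of 212 tumors,
# tested against an assumed background frequency of 0.20.
p_enrichment = binomial_upper_tail(130, 212, 0.20)
```

A two-sided version would additionally sum the probabilities of outcomes at least as unlikely in the lower tail.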

Statistical analysis

Progression-free survival and overall survival were analyzed by the Kaplan–Meier method, and p values were reported using the log-rank test. Associations between covariates and risk groups were tested by Fisher’s exact test. Univariable and multivariable Cox proportional hazards regression was used to estimate hazard ratios with 95% confidence intervals. All statistical analyses were performed in the R statistical environment (v3.4.1), using the R packages survival (v2.41-3) and ggplot2 (v2.2.1).
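For readers unfamiliar with the Kaplan–Meier method named above, the following minimal sketch shows the product-limit estimator on toy data. It is not the survival package implementation used in the study; the function name and the follow-up times are hypothetical, and an event flag of 1 denotes progression or death while 0 denotes censoring.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        current_time = data[i][0]
        n_events = 0
        n_leaving = 0
        # Gather every patient whose follow-up ends at this time point.
        while i < len(data) and data[i][0] == current_time:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            # Product-limit step: multiply by the fraction surviving this time.
            survival *= 1.0 - n_events / at_risk
            curve.append((current_time, survival))
        # Both events and censored patients leave the risk set.
        at_risk -= n_leaving
    return curve


# Toy follow-up in years (made-up): 1 = event, 0 = censored.
curve = kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 0, 1])
```

Censored patients reduce the number at risk without producing a step in the curve, which is why long follow-up matters when events occur late.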

Results

DNA methylation profiling identifies five distinct subtypes of PFB ependymoma

Genome-wide methylation profiling was available on 212 primary posterior fossa ependymomas (76 frozen and 136 formalin-fixed paraffin-embedded samples) previously assigned to the PFB subgroup using either the Illumina HumanMethylation450 (n = 180) or HumanMethylationEPIC arrays (n = 32) and the Heidelberg brain tumor classifier (https://www.molecularneuropathology.org/mnp) [2].

To discern the extent of heterogeneity within the PFB subgroup, we applied unsupervised clustering using the most variably methylated probes (standard deviation over 0.25). A distance matrix was constructed using 1 minus the weighted Pearson correlation, and both t-SNE analysis and spectral clustering were applied to the resulting distance matrix (Fig. 1a, b). Using t-SNE analysis with 2000 iterations, five distinct clusters were identified, which we term subtypes. Spectral clustering was also applied for k = 2 to k = 8, confirming the optimal number of clusters at k = 5 (Fig. 1b, Supplemental Fig. 1). The subtypes obtained by t-SNE were largely identical to those obtained by spectral clustering at k = 5 (Supplemental Fig. 2A). No major differences in overall methylation levels at probes near promoters were identified across the five subtypes (Supplemental Fig. 2B). In four instances, paired samples from diagnosis and relapse were available, and we did not observe any switching of subtype at recurrence (1 PFB1, 2 PFB2, and 1 PFB3).

Fig. 1

Clustering of PFB reveals five distinct subtypes. a t-SNE plot showing the relative distribution of the five subtypes. b Heatmap obtained from spectral clustering (SpC), whereby yellow represents more similar samples and red represents disparate ones. Color bars at the top indicate the t-SNE clusters and the spectral clustering (SpC) clusters at k = 5. c t-SNE plot showing PFB clustered concomitantly with PF-SE and PFA. d Heatmap representing the expression levels of the ten most differentially expressed genes per subtype. Each column represents one sample, and each row represents one gene. Gene expression levels are represented by a color scale as indicated

To determine whether any of the PFB subtypes, particularly those enriched for older patients, were closer to subependymoma, we clustered the 212 PFBs together with 34 posterior fossa subependymomas (PF-SE) and 220 PFAs. The five PFB subtypes clustered together, without any overlap with PF-SE or PFA, suggesting that the heterogeneity within PFB is distinct from the other two groups (Fig. 1c).

Differential gene expression across PFB subtypes

Array-based gene expression analysis of the five PFB subtypes (n = 25) reveals the top upregulated genes per subtype (Fig. 1d). Differential expression analysis identifies a total of 693 genes as significantly differentially expressed (adj. p < 0.05) in comparisons of one subtype versus the other four. Similar to the methylation-based clustering, PFB4 and PFB5 had significantly more differentially expressed genes than PFB1, 2, and 3, which exhibited fewer differences. Subsets of these genes, which included transcription factors, components of developmental patterning pathways, and potential drug targets, showed upregulated expression within the five subtypes. Gene-ranked pathway enrichment analysis of subtype-specific genes identified several general and specific developmental processes and developmental patterning pathways as deregulated, specifically within PFB4 and PFB5 (Supplemental Fig. 3) [18].

Demographic differences across PFB subtypes

Annotation of the demographics of the five subtypes reveals significant differences in age at diagnosis, specifically an enrichment of younger patients in PFB4 (median 15 years, IQR 10.1–23.5 years) and an enrichment of older patients in PFB5 (median 40.6 years, IQR 28.5–50 years; Fig. 2a, p = 0.011). PFB1, PFB2, and PFB3 had similar median ages at diagnosis (PFB1: median 25.9 years, IQR 13.4–42 years; PFB2: median 26 years, IQR 17–41 years; PFB3: median 29.1 years, IQR 13.2–41 years; Fig. 2a). Moreover, there are significant gender differences, wherein PFB2 and PFB4 show a strong male bias, whereas PFB3 and PFB5 show a strong female bias (Fig. 2b, p = 0.04).

Fig. 2

Clinical characteristics across PFB subtypes. a Boxplot of age at diagnosis across PFB subtypes. b Frequency of male and female gender within the five subtypes. p values for age determined using the Kruskal–Wallis test and frequency of gender using the Fisher’s exact test

PFB subtype-specific chromosomal aberrations

We generated genome-wide DNA copy-number profiles using the combined intensity of the methylated and unmethylated probes, as previously described [5]. Many samples displayed aneuploidy without any recurrent amplifications or deletions, consistent with previous descriptions of PFB ependymoma [26]. Moreover, the overall patterns of broad copy-number changes were similar to previous reports, including 1q gain (12%), monosomy 6 (61.3%), monosomy 10 (38.7%), monosomy 17 (33.5%), trisomy 5 (31.1%), trisomy 8 (23.5%), trisomy 18 (51.9%), and 22q loss (48.1%) [14]. However, when we repeated this analysis across the PFB subtypes, we observed several subtype-specific aberrations, such as monosomy 2 in PFB2 (PFB1 17/75, PFB2 18/36, PFB3 10/54, PFB4 4/30, PFB5 1/17, p = 0.0014), monosomy 3 in PFB1 (PFB1 38/75, PFB2 5/36, PFB3 11/54, PFB4 4/30, PFB5 2/17, p = 3.8 × 10⁻⁵), monosomy 6 in PFB1, 2, and 3 (PFB1 51/75, PFB2 26/36, PFB3 41/54, PFB4 8/30, PFB5 4/17, p = 1.2 × 10⁻⁶), monosomy 8 in PFB3 (PFB1 1/75, PFB2 5/36, PFB3 25/54, PFB4 2/30, PFB5 1/17, p = 1.3 × 10⁻¹¹), gain of chromosome 11 in PFB4 (PFB1 17/75, PFB2 13/36, PFB3 14/54, PFB4 21/30, PFB5 0/17, p = 4.4 × 10⁻⁶), and enrichment of 1q gain (PFB1 19/75, PFB2 5/36, PFB3 1/54, PFB4 0/30, PFB5 1/17, p = 1.6 × 10⁻⁴) and monosomy 17 (PFB1 44/75, PFB2 10/36, PFB3 9/54, PFB4 5/30, PFB5 3/17, p = 1.6 × 10⁻⁶) within PFB1 (Fig. 3a, b). Copy-number solutions for k = 3 and 4 suggested that PFB1, 2, and 3 were more similar groups overall, but, at k = 5, certain broad events were further enriched, such as monosomy 2 in PFB2, monosomy 8 in PFB3, monosomy 3 in PFB1, and gain of 1q in PFB1 (Supplemental Fig. 3). t-SNE analysis and NMF clustering of broad copy-number aberrations could not recapitulate any of the observed subtypes (Supplemental Fig. 4).
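The relationship between the per-subtype counts and the cohort-wide percentages quoted above can be checked with a few lines of arithmetic. The sketch below uses only counts stated in the text for three of the events; the dictionary layout and function names are illustrative choices, not part of the original analysis.

```python
# Subtype sizes and per-subtype event counts as reported in the text.
SUBTYPE_SIZES = {"PFB1": 75, "PFB2": 36, "PFB3": 54, "PFB4": 30, "PFB5": 17}

EVENT_COUNTS = {
    "1q gain": {"PFB1": 19, "PFB2": 5, "PFB3": 1, "PFB4": 0, "PFB5": 1},
    "monosomy 6": {"PFB1": 51, "PFB2": 26, "PFB3": 41, "PFB4": 8, "PFB5": 4},
    "monosomy 17": {"PFB1": 44, "PFB2": 10, "PFB3": 9, "PFB4": 5, "PFB5": 3},
}


def overall_percentage(event):
    """Cohort-wide frequency of an event, pooled over all five subtypes."""
    affected = sum(EVENT_COUNTS[event].values())
    return 100.0 * affected / sum(SUBTYPE_SIZES.values())


def subtype_percentages(event):
    """Frequency of an event within each subtype."""
    return {
        subtype: 100.0 * EVENT_COUNTS[event][subtype] / size
        for subtype, size in SUBTYPE_SIZES.items()
    }


for event in EVENT_COUNTS:
    print(event, round(overall_percentage(event), 1), subtype_percentages(event))
```

Pooling the counts reproduces the cohort-wide figures for these three events, while the per-subtype percentages make the enrichment pattern explicit, for example 1q gain in roughly a quarter of PFB1 tumors and in none of the PFB4 tumors.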

Fig. 3

Arm-level cytogenetic events across PFB subtypes. a Frequency and significance of arm-level gains and losses across the five PFB subtypes. Darker bars show significant arm-level events (q value ≤ 0.1, Chi-squared test). b Frequency and significance of whole-chromosome gains and losses across the five PFB subtypes. Darker bars show significant arm-level events (q value ≤ 0.1, Chi-squared test)

Extent of surgical resection is the strongest predictor of progression-free survival across PFB ependymoma

Survival across the five PFB subtypes was assessed using Kaplan–Meier analysis, and no significant difference in overall or progression-free survival was observed between the five subtypes (Supplemental Fig. 5A, B). Progression-free survival was available for 137 cases, and progression events were evenly distributed across the five subtypes. Late relapses were common in this cohort: 13 of the 36 relapses occurred after 5 years, and five occurred after 10 years, highlighting the importance of long-term follow-up in this group. Late relapses were evenly distributed across the five subtypes. Sixteen deaths were observed in the cohort: nine in PFB1, five in PFB3, one each in PFB2 and PFB4, and none in PFB5.

The rates of incomplete resection and upfront adjuvant radiation were not significantly different across the five subtypes (Supplemental Fig. 6C, D). Consistent with our previous observations, in this cohort overall, a subtotal resection (STR) portended a worse progression-free survival in a univariable analysis (STR: HR 3.01, 95% CI 1.51–5.98, p = 1.7 × 10⁻³). Interestingly, in this cohort, neither the absence of upfront radiation nor 1q gain was a significant predictor of poor outcome in a univariable analysis, although both had hazard ratios above 1, suggesting a possible trend toward a worse outcome (no upfront radiation: HR 1.76, 95% CI 0.83–3.75, p = 0.15; 1q gain: HR 1.84, 95% CI 0.76–4.48, p = 0.18) (Supplemental Table 2).
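To make the link between a hazard ratio, its confidence interval, and its p value more concrete, the sketch below back-calculates an approximate Wald statistic from the reported subtotal resection estimate (HR 3.01, 95% CI 1.51–5.98). It assumes the interval is a symmetric Wald interval on the log hazard scale, which is the usual form for Cox models but is not stated in the text; the function name is hypothetical, and the result is an approximation rather than a re-analysis.

```python
import math


def wald_summary(hazard_ratio, ci_lower, ci_upper, z_crit=1.959964):
    """Approximate SE, z, and two-sided p from a hazard ratio and its 95% CI."""
    log_hr = math.log(hazard_ratio)
    # Under the Wald assumption the CI spans 2 * z_crit standard errors on the log scale.
    standard_error = (math.log(ci_upper) - math.log(ci_lower)) / (2.0 * z_crit)
    z = log_hr / standard_error
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return standard_error, z, p_two_sided


# Subtotal resection estimate as reported in the univariable analysis.
se_str, z_str, p_str = wald_summary(3.01, 1.51, 5.98)
```

Under this assumption the back-calculated p value is of the same order as the reported one, which is a quick consistency check readers can apply to the other hazard ratios in this section.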

To discern whether other copy-number abnormalities confer any prognostic relevance, an exploratory univariable analysis was undertaken, whereby survival was assessed across all copy-number aberrations with at least a 10% incidence. Surprisingly, loss of 13q was significant in a univariable analysis (Fig. 4, log-rank p = 0.01); a univariable analysis of 13q loss compared to both balanced 13q and 13q gain confirmed a significant effect size (HR 2.66, 95% CI 1.262–5.591, p = 0.01). A multivariable analysis was then undertaken incorporating 13q status and extent of resection, and the prognostic value of 13q loss remained significant (HR 2.73, 95% CI 1.18–5.32, p = 0.017). To further determine whether the combination of 13q loss and extent of resection can predict progression-free survival, a Kaplan–Meier analysis was performed comparing 13q balanced/gain versus 13q loss, stratified by extent of resection. This analysis suggests that subtotally resected tumors with 13q loss may represent a novel high-risk group of PFB ependymoma. Loss of 13q was not significant in predicting overall survival (HR 0.92, 95% CI 0.20–4.20, p = 0.92).

Fig. 4

Progression-free survival stratified by a 1q gain and b 13q status. p values determined using the log-rank test

To determine whether extent of surgical resection and chromosome 13q loss remained prognostic when correcting for PFB subtype, age, gender, 1q gain, and upfront radiation, a multivariable analysis of progression-free survival was performed (Table 1). Both incomplete resection and 13q loss remained significant when correcting for these covariates. Interestingly, upfront radiation and 1q gain were marginally significant in the multivariable model; however, a significant interaction could not be discerned between subtotal resection, upfront radiation, and 1q gain. Only incomplete resection was a risk factor for worse overall survival (Table 1). Overall outcome was excellent across the whole cohort, with only 16 deaths across 142 patients, and none of the other variables above were predictive of death, although longer follow-up is likely required.

Table 1 Multivariable Cox proportional hazards model of survival across PFB ependymoma

Discussion

Across the largest cohort of PFB ependymoma assembled to date, we show that PFB comprises five distinct molecular subtypes. These five subtypes display distinct demographics and copy-number profiles, suggesting that significant molecular heterogeneity exists within PFB ependymoma. We also show that none of the five subtypes of PFB ependymoma overlap with either subependymoma or any of the PFA ependymoma subtypes, further supporting the existence of three major molecular subgroups of posterior fossa ependymoma.

PFB ependymoma has been considered for de-escalation of therapy because of its excellent overall survival [12]. Previously, we have shown that 50% of gross totally resected PFB ependymomas can be cured with surgery alone, suggesting that this group could potentially be spared the long-term side effects of radiation and/or chemotherapy [15]. Our current data suggest that even when correcting for heterogeneity within PFB ependymoma, this group can still benefit from de-escalation of therapy. However, the observation that late relapses are common in PFB highlights that long-term follow-up is required for any eventual trial of de-escalation of therapy. Overall survival in the cohort was excellent, suggesting that successful salvage is possible; however, the observation that deaths were enriched in PFB1 and PFB3 requires further evaluation in future cohorts.

Several studies over the past 15 years have suggested 1q gain to be a marker of poor prognosis across ependymomas as a whole [7, 8]. However, when 1q is evaluated in a subgroup-specific manner, it is only prognostic for progression-free and overall survival in PFA, with no prognostic relevance in the RELA subgroup of supratentorial ependymoma [14]. Indeed, 1q-gained PFA are currently being prioritized for novel upfront approaches. Our previous work suggested that 1q gain was prognostic for progression-free survival but not overall survival within PFB [14]. Our current results in a much larger cohort suggest that 1q gain may have only limited utility as a prognostic marker, specifically for the selection of patients who would benefit from radiation-sparing strategies, and this requires further evaluation in prospective cohorts. The current results are, however, consistent with our previous observation that overall survival is not significantly different in PFB patients harboring 1q-gained tumors. The relatively high hazard ratio for 1q gain suggests that its role as a prognostic marker may warrant further evaluation. For the next generation of clinical trials within posterior fossa ependymoma, however, 1q should be used for stratification only within PFA, which highlights the importance of molecularly informed studies.

We show, for the first time, that chromosome 13q loss may represent a more reliable prognostic marker across PFB ependymoma than 1q gain. Indeed, when stratifying chromosome 13q loss by extent of resection, gross totally resected PFB ependymoma with 13q loss appears to still have a poor outcome. The absence of other large cohorts of PFB ependymoma precludes independent validation of this finding; however, it warrants further study, particularly in the context of ongoing and upcoming prospective cooperative group studies.

Most importantly, however, the identification of multiple subtypes of PFB, analogous to recent work across PFA ependymoma, shows clearly that significant biological heterogeneity exists within this group [13]. The current cohort is significantly limited by a lack of sufficient gene expression data, but the available data support the heterogeneity within PFB as determined by methylation profiling. Prospective collection of frozen tissue, particularly in adults with posterior fossa ependymoma, will be crucial to determine which specific biological processes underlie the observed subtypes. As such, our identification of five subtypes provides strong support for the prospective collection of tissue in this group for further next-generation integrated analysis. Moreover, the observation that late relapses and deaths are not infrequent in this group highlights a need to identify and validate actionable pathways. Indeed, our identification of multiple subtypes of PFB has profound implications for future targeted therapies, as any “PFB”-specific therapies may be restricted to specific subtypes. This is especially important in the context of the observed subtype-specific copy-number aberrations, which suggest that PFB tumors do not emerge from a single driving event. To date, no recurrent mutations have been found in PFB ependymoma, and a clear driver for these tumors is still lacking [10]. By focusing on specific PFB subtypes and by making use of additional molecular data, such as large-scale gene expression profiling and more comprehensive next-generation sequencing enriched for each subtype, future studies may be able to elucidate the underlying events that ultimately lead to the development of these tumors [10].

Further delineating the biology and prognostic markers of PFB ependymoma will require extensive collaboration across adult and pediatric centers, as adult ependymoma is currently an often-neglected entity, with a paucity of frozen tissue available. Nonetheless, future basic science and translational efforts on PFB should account for the heterogeneity within PFB ependymoma in the development of more targeted therapeutics.