Introduction

Medulloblastoma is the most common malignant pediatric brain tumor [3]. Despite multimodal treatment consisting of maximal safe surgical resection, radiation, and chemotherapy, a significant proportion of patients still succumb to their disease [20]. Recent integrative genomic studies have identified four distinct subgroups of medulloblastoma: WNT, SHH, Group 3 and Group 4 [2, 13, 16, 27, 28]. These four subgroups have disparate demographics, clinical features, and genetics. Previous work demonstrates that the clinical parameters used to risk-stratify patients are largely attributable to molecular subgroup differences. For example, WNT patients have the best prognosis, whereas Group 3 patients often present with metastatic disease and have the worst prognosis [12, 15, 21]. As patient mortality and high-risk disease are characterized by the presence of metastatic lesions, there is significant interest in unraveling the role of subgroup affiliation in the primary and metastatic compartments.

A previous study comparing primary and recurrent medulloblastoma, using a 22-gene nanoString probe-set, demonstrated that subgroup affiliation is maintained at recurrence [22]. This finding deviates markedly from other neoplasms, such as glioblastoma multiforme, where molecular subclass switching has been identified both temporally and spatially [10, 18, 25]. What remains unknown is whether medulloblastoma maintains subgroup identity between the primary and metastatic compartments. As inclusion/exclusion schemas for many clinical trials already necessitate molecular subtyping, establishing the molecular subgroup in both the primary and metastatic compartments remains of critical importance [30]. Whether molecular subgroups play a significant prognostic and biological role in the metastatic compartment remains to be seen. Moreover, future trials will likely evaluate patients with relapsed/recurrent and metastatic disease, highlighting the need to establish molecular subgroup identity in both the primary and metastatic disease.

Methodology

Patients

Our integrative molecular and clinical analysis comprised two non-overlapping cohorts. Cohort 1 (discovery) consisted of all patients with metastatic medulloblastoma for whom either frozen or formalin-fixed paraffin-embedded (FFPE) material was available, along with clinical variables and survival data, from 10 different centers (Johns Hopkins University School of Medicine, Baltimore, MD, USA; Virginia Commonwealth University, Richmond, VA, USA; New York University Langone Medical Center, New York, NY, USA; Children’s Hospital of Minnesota, Minneapolis, MN, USA; Stanford University School of Medicine, Stanford, CA, USA; Emory University, Atlanta, GA, USA; Texas Children’s Cancer Center, Houston, TX, USA; Weill Medical College of Cornell University, New York, NY, USA; Brain Tumour Tissue Bank, London, ON, Canada; Hospital for Sick Children, Toronto, ON, Canada). RNA was extracted from matched primary and metastatic samples using TRIzol, according to the manufacturer’s instructions. Cohort 2 (validation) consisted of samples from patients with metastatic medulloblastoma obtained at the NN Burdenko Neurosurgical Institute (Moscow, Russia). For all available clinical characteristics of both patient cohorts, including the location of available metastases, please see Supplementary Table 2.

Subgroup determination was established using gene expression profiling, nanoString-targeted gene expression profiling, as well as 450k DNA methylation, as previously described, in all cases, where available, from cohort 1 [7, 14, 16]. Subgroup affiliation for cohort 2 was completed by immunohistochemistry employing the four-antibody approach, as previously described (WNT = nuclear β-catenin, SHH = SFRP1, Group 3 = NPR3, Group 4 = KCNA1) [16, 23, 24]. For SFRP1 and NPR3, we detected membranous–cytoplasmic staining and most of the tumor cells were stained with these markers. For KCNA1, we detected cytoplasmic and nuclear staining with wide extensions in the Group 4 tumors.

The research ethics boards at all participating centers approved the study and all samples and clinical information were obtained with consent in accordance with the research ethics board at the Hospital for Sick Children and collaborating centers.

Statistical analysis

Whole genome expression data were generated using the Affymetrix GeneChip Human Gene 2.0 ST Array. Samples were normalized using RMA as part of the R/Bioconductor oligo package (version 1.26.6) [8]. DNA methylation data were generated using the Illumina Infinium HumanMethylation450 BeadChip array (450k array). Samples were normalized using SWAN as part of the R/Bioconductor minfi package (version 1.12.0). Differential expression between primary and metastatic samples was assessed using a generalized linear model with empirical Bayes adjustment, as implemented in the limma package from R (version 3.0.2). Unsupervised hierarchical clustering (HCL) using the Pearson correlation metric and non-negative matrix factorization (NMF) consensus analysis for whole genome expression and DNA methylation were completed using the top 1,000 differentially expressed genes and top 10,000 differentially methylated probes, respectively. We used the cophenetic coefficient as a measure of correlation between the sample distances induced by the consensus matrix [1]. In the NMF plots, the red circle marks the number of clusters resulting in the highest similarity between samples. Principal component analysis was done in the Partek Genomic Suite, and HCL and NMF were done using MultiExperiment Viewer (version 10.2). Class prediction was done using prediction analysis of microarrays (PAM) as previously described [29], using the expression training data reported by Northcott et al. [16] (Gene Expression Omnibus accession No. GSE 21140) and the methylation training data reported by Hovestadt et al. [6] (Gene Expression Omnibus accession No. GSE 54880). Raw and normalized whole genome expression and 450k DNA methylation data were deposited in Gene Expression Omnibus under accession number GSE 63670.
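As an illustration of the Pearson correlation metric used for hierarchical clustering, the following minimal Python sketch computes the correlation distance (1 minus the Pearson coefficient) between sample profiles and reports each sample's nearest neighbour. It is not the pipeline used in this study: the sample names, the five-gene profiles, and the helper functions are hypothetical and serve only to show how a matched primary-metastasis pair ends up closest to each other under this metric.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def correlation_distance(x, y):
    """0 for perfectly correlated profiles, 2 for perfectly anti-correlated ones."""
    return 1.0 - pearson(x, y)

def nearest_sample(name, profiles):
    """Return the other sample whose profile is closest to `name`."""
    others = (s for s in profiles if s != name)
    return min(others,
               key=lambda s: correlation_distance(profiles[name], profiles[s]))

# Hypothetical log-expression values for five genes (illustrative only,
# not data from this study).
profiles = {
    "P1_primary":    [8.1, 2.0, 5.5, 9.0, 3.2],
    "P1_metastasis": [7.9, 2.3, 5.1, 9.4, 3.0],
    "P2_primary":    [2.2, 8.7, 6.0, 3.1, 9.5],
    "P2_metastasis": [2.5, 8.9, 6.3, 2.8, 9.1],
}

for sample in profiles:
    print(sample, "->", nearest_sample(sample, profiles))
```

Because the correlation distance ignores overall scale and offset, two samples with the same expression pattern are treated as close even if their absolute intensities differ, which is one reason this metric is commonly chosen for array data.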

Results

Cohort description

Biopsies of metastatic medulloblastoma lesions are not routinely taken; as such, very few primary-metastatic pairs have been analyzed. We collected what is, to our knowledge, a relatively large cohort of primary-metastatic pairs and performed integrative genetic analysis to determine subgroup affiliation. Table 1 shows the demographics of all patients in this study. Owing to the rarity of patient samples with matched primary and metastatic tissue, 9 patients were subjected to gene expression profiling and 11 patients were profiled using high-resolution genome-wide methylation arrays. Eight of the 12 patients have both gene expression and 450k DNA methylation data; this cohort of patients will thus be referred to as the discovery cohort. We also conducted immunohistochemistry on a non-overlapping cohort of patient samples obtained from the Burdenko Neurosurgical Institute; this cohort of patients will be referred to as the validation cohort. The discovery and validation cohorts have a similar age distribution, with the vast majority of patients between the ages of 5 and 18. The cohorts are comparable in terms of gender and histology. Using a previously validated 22-gene nanoString probe-set for subgroup determination [14], the most frequent subgroup is Group 4, followed by Group 3 (Fig. 1a). We did not have any WNT patients, which likely reflects the largely local and non-metastatic nature of these tumors. Using an established cohort of 103 patients with known subgroup affiliation as the training set, we further used prediction analysis of microarrays (PAM) to assign a subgroup to each sample in the primary-metastasis pairs (Supplementary Table 1).

Table 1 Clinical characteristics of matched primary and metastatic medulloblastoma in the discovery and validation cohorts
Fig. 1

a Heat map of relative gene expression of 22-gene nanoString probe-set (normalized with ACTB, GAPDH, LDHA) on 17 samples (6 matched primary-metastasis patients). b Unsupervised hierarchical clustering of human 2.0 exon array (Affymetrix GeneChip Human Gene 2.0 ST Array) expression data from 22 medulloblastoma samples (9 matched primary-metastasis patients) using 1,000 most differentially expressed genes. c Non-negative matrix factorization (NMF) consensus analysis provides strong statistical support for three subgroups (k = 2, cophenetic coefficient = 0.86; k = 3, cophenetic coefficient = 0.87; k = 4, cophenetic coefficient = 0.77). d Principal component analysis (PCA) of the primary and metastatic medulloblastoma samples described in (b) using the same 1,000 most differentially expressed genes. Colored ellipsoids (red SHH, yellow Group 3, green Group 4) represent 1.5 SDs of the data distribution for each subgroup. Individual primary samples are indicated with magenta color and metastatic samples are indicated with purple color

Subgroup stability by expression

Using gene expression signatures (Affymetrix GeneChip Human Gene 2.0 ST Array) from 9 primary-metastasis pairs, we show that subgroup affiliation is stable between the primary and metastatic compartments. Unsupervised hierarchical clustering using the top 1,000 differentially expressed probes recapitulates the subgroups despite the low sample number. In all 9 pairs, the primary and metastatic samples clustered within the same subgroup and, furthermore, clustered with the same patient, even in cases with multiple metastases (Fig. 1b). We further demonstrate using NMF consensus clustering that in all but one case (patient 4), the matched primary and metastatic samples are most similar to each other, with the highest support for 3 subgroups (k = 3, cophenetic coefficient = 0.87) (Fig. 1c). The similarity and stability of subgroup between the primary and metastatic compartments were also demonstrated using principal component analysis (PCA) (Fig. 1d). The primary (pink) consistently clusters with the matched metastasis (purple). Samples from the same patient also cluster more closely to each other than to samples from other patients (Supplementary Fig. 1a). Using three orthogonal methods, we demonstrate that primary and metastatic samples from the same patient cluster together.
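To make the cophenetic coefficient reported above more concrete, the sketch below shows one way such a value can be computed from a consensus matrix. It is a simplified illustration under stated assumptions, not the MultiExperiment Viewer implementation: the repeated clustering runs are hypothetical label vectors standing in for repeated NMF factorizations, and average linkage is assumed for the hierarchical clustering of the consensus distances.

```python
import math
from itertools import combinations

def consensus_matrix(runs):
    """Fraction of clustering runs in which each pair of samples shares a cluster."""
    n = len(runs[0])
    return [[sum(r[i] == r[j] for r in runs) / len(runs) for j in range(n)]
            for i in range(n)]

def cophenetic_distances(dist):
    """Average-linkage clustering; returns the merge height joining each pair."""
    n = len(dist)
    clusters = {i: [i] for i in range(n)}
    coph = [[0.0] * n for _ in range(n)]
    while len(clusters) > 1:
        best = None
        for a, b in combinations(sorted(clusters), 2):
            pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
            height = sum(dist[i][j] for i, j in pairs) / len(pairs)
            if best is None or height < best[0]:
                best = (height, a, b)
        height, a, b = best
        for i in clusters[a]:
            for j in clusters[b]:
                coph[i][j] = coph[j][i] = height
        clusters[a] = clusters[a] + clusters.pop(b)
    return coph

def cophenetic_coefficient(consensus):
    """Pearson correlation between consensus distances and cophenetic distances."""
    n = len(consensus)
    dist = [[1.0 - consensus[i][j] for j in range(n)] for i in range(n)]
    coph = cophenetic_distances(dist)
    x = [dist[i][j] for i, j in combinations(range(n), 2)]
    y = [coph[i][j] for i, j in combinations(range(n), 2)]
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical cluster labels for six samples over three runs (illustrative only).
stable_runs = [[0, 0, 1, 1, 2, 2]] * 3            # every run agrees
noisy_runs = [[0, 0, 1, 1, 2, 2],
              [0, 0, 1, 1, 2, 2],
              [0, 0, 1, 2, 2, 2]]                 # one sample switches once

print("stable:", cophenetic_coefficient(consensus_matrix(stable_runs)))
print("noisy: ", cophenetic_coefficient(consensus_matrix(noisy_runs)))
```

When every run assigns the samples identically, the consensus distances are exactly reproduced by the dendrogram and the coefficient reaches its maximum of 1; any run-to-run disagreement pulls it below 1, which is why the coefficient is compared across candidate values of k.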

Subgroup stability by methylation

To further demonstrate subgroup stability between primary and metastasis, we performed Illumina 450k DNA methylation array profiling (Infinium HumanMethylation450 BeadChip) on 11 patient pairs. Unsupervised hierarchical clustering using the top 10,000 most differentially methylated probes, as calculated by the Kruskal–Wallis test, demonstrates maintenance of subgroup between primary and metastatic pairs. In all cases, the primary (pink) clustered together with the metastases (purple) (Fig. 2a). NMF consensus analysis further provides statistical support for the three medulloblastoma subgroups that remain stable between patient pairs (k = 3, cophenetic coefficient = 1.0) (Fig. 2b). By principal component analysis, the methylation profiles of the primary and metastatic samples cluster together within subgroups, indicating a strong degree of stability between patients in the same subgroup (Fig. 2c). Furthermore, different samples from the same patient within each subgroup are more similar to each other, as indicated by the overlap of individual colored spheres (Supplementary Fig. 1b). Using a publicly available dataset of 100 primary medulloblastoma samples with subgroup affiliation determined through 450k DNA methylation array, we further validated the stability of subgroup between primary and metastases using PAM prediction (Supplementary Table 1). Using integrative genetic analysis of gene expression signatures and 450k DNA methylation, we demonstrate the maintenance and stability of medulloblastoma subgroups between the primary and metastatic compartments. Using the orthogonal technique of immunohistochemistry on a non-overlapping cohort of 19 primary-metastasis patient pairs, we further validated the maintenance of subgroup affiliation between primary and metastatic compartments (Fig. 3a). Supplementary Table 1 shows a summary of the subgroup affiliations using different platforms and statistical tests.
We observed a total of 4/28 misclassified samples using 3 different strategies comprising both gene expression and DNA methylation data for subgrouping, totaling 168 tests, and thus only a very small discordance rate (2.98 %). Consensus clustering using Illumina Infinium HumanMethylation450 arrays is currently considered the gold standard. Using consensus clustering by high-density methylation arrays, the primary and metastatic samples uniformly share subgroup affiliation. We therefore conclude, using multiple experimental approaches examining the levels of gene expression, DNA methylation, and protein expression, that medulloblastoma subgroups remain stable across both primary and metastatic compartments.
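The probe selection by Kruskal–Wallis test described above can be sketched as follows. This is an illustrative Python outline rather than the code used in this study: the probe identifiers and beta values are hypothetical, and the choice of grouping samples by subgroup label when computing the statistic is an assumption, since the grouping variable is not specified in the text.

```python
from collections import Counter

def average_ranks(values):
    """Rank values from 1 to N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # ranks are 1-based
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks

def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of groups of values."""
    pooled = [v for g in groups for v in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n_total * (n_total + 1)) * total - 3.0 * (n_total + 1)
    ties = Counter(pooled).values()
    correction = 1.0 - sum(t ** 3 - t for t in ties) / (n_total ** 3 - n_total)
    return h / correction if correction > 0 else 0.0

def top_probes(beta_by_probe, labels, n_top):
    """Rank probes by H across the label groups and keep the n_top highest."""
    groups_present = sorted(set(labels))
    scores = {}
    for probe, betas in beta_by_probe.items():
        groups = [[b for b, lab in zip(betas, labels) if lab == g]
                  for g in groups_present]
        scores[probe] = kruskal_wallis_h(groups)
    return sorted(scores, key=scores.get, reverse=True)[:n_top]

# Hypothetical beta values for three probes across six samples (illustrative only).
labels = ["SHH", "SHH", "Group3", "Group3", "Group4", "Group4"]
beta_by_probe = {
    "probe_A": [0.90, 0.85, 0.10, 0.15, 0.50, 0.55],  # separates all groups
    "probe_B": [0.20, 0.80, 0.30, 0.70, 0.40, 0.60],  # no group structure
    "probe_C": [0.90, 0.50, 0.10, 0.15, 0.55, 0.85],  # partial separation
}

print(top_probes(beta_by_probe, labels, n_top=2))
```

In a real analysis the same ranking would be applied to all probes on the array, with the top 10,000 retained for clustering; the rank-based statistic makes the selection insensitive to the bounded, non-normal distribution of beta values.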

Fig. 2

a Unsupervised hierarchical clustering of 450k DNA methylation (Infinium HumanMethylation450 BeadChip Kit) data from 27 medulloblastoma samples (11 matched primary-metastasis patients) using 10,000 most differentially methylated probes. b Non-negative matrix factorization (NMF) consensus analysis provides strong statistical support for three subgroups (k = 2, cophenetic coefficient = 1.0; k = 3, cophenetic coefficient = 1.0; k = 4, cophenetic coefficient = 0.85). c Principal component analysis (PCA) of the primary and metastatic medulloblastoma samples described in (a) using the same 10,000 most differentially methylated probes. Colored ellipsoids (red SHH, yellow Group 3, green Group 4) represent 1.5 SDs of the data distribution for each subgroup. Individual primary samples are indicated with magenta color and metastatic samples are indicated with purple color

Fig. 3

a Immunohistochemistry of 19 matched primary-metastasis patient samples in our validation cohort (SHH = SFRP1, Group 3 = NPR3, Group 4 = KCNA1) provides additional support, using an orthogonal technique, for the maintenance of molecular subgroups between primary and metastatic compartments

Discussion

Herein, we demonstrate that medulloblastoma subgroup affiliation remains stable in both the primary and metastatic compartments. Using a multimodal validation strategy integrating molecular tools (both gene expression and methylation analysis) and immunohistochemistry, we evaluated two non-overlapping cohorts of medulloblastoma. This study, to our knowledge, represents the largest study to date designed to evaluate matched primary and metastasis samples with detailed subgroup information. Metastatic and primary disease from the same subgroup consistently cluster together, further highlighting their similarity and strengthening the notion that medulloblastoma subgroups are distinct entities.

Our finding that subgroup affiliation is stable between the primary and metastatic compartments further reinforces the stability of medulloblastoma subgroups. Indeed, this finding further suggests that medulloblastoma subgroups arise from distinct cells of origin [5, 11, 19, 22]. The maintenance of subgroup affiliation between the two compartments reflects the primary and metastatic compartments sharing a cell of origin. However, our previous work suggests that the metastatic compartment is distinct from the primary. Clinically, Group 3 and 4 patients fail almost exclusively with metastatic dissemination, suggesting that a therapy-resistant subclone drives relapse [22]. This, coupled with our previous cross-species genomic studies, suggests that in both murine and human medulloblastoma, the primary and metastatic compartments are genomically distinct [31]. The current work suggests that although the cell of origin is retained between the primary and metastatic compartments, the two compartments are distinct within the context of a preserved subgroup affiliation.

It is of interest to note that although subgroup affiliation is preserved between the primary and metastatic compartments, metastases often cluster closer to each other than to their matched primary disease. Although this evidence is preliminary given our limited number of samples with multiple metastases, this finding suggests the intriguing possibility that clonal evolution has given rise to divergent populations in the metastatic compartment. Previous evidence from murine medulloblastoma indeed shows that the primary and metastatic compartments are biologically distinct and harbor different driver events [31]. This observation may have significant clinical implications; therapies aimed at targeting disease subgroups may be more efficacious than targeting single genetic aberrations, which may or may not be present in the metastatic compartment or at recurrence.

Treatment for metastatic medulloblastoma has led to survival rates approaching 70 % [4, 9, 17, 26]. However, the requirement for 36 Gy of craniospinal irradiation results in devastating neurocognitive sequelae. To further increase survival and improve quality of life, targeted therapies aimed at the metastatic compartment are urgently required. Future clinical trials, which are often conducted in the setting of metastatic or relapsed disease, need to prioritize targets that are present in metastatic lesions. To better understand the metastatic compartment, sampling of the metastatic disease needs to be considered if possible. However, sampling for the sole purpose of subgrouping is unwarranted and, based on the findings of this paper, unnecessary; subgroup should instead be extrapolated from the primary disease. Prospective multicenter longitudinal studies of metastatic medulloblastoma need to be conducted in a subgroup-specific fashion to increase our understanding of metastatic progression. Further studies using high-resolution platforms, such as RNA sequencing and next-generation whole genome sequencing, comparing primary tumors and matched metastases will guide therapeutic development.