Primary central nervous system lymphoma (PCNSL), also referred to as primary diffuse large B-cell lymphoma (DLBCL) of the central nervous system (CNS) [5], manifests aggressive clinical behavior. The mere existence of a malignant lymphoma confined to an immunoprivileged organ devoid of classical secondary lymphoid structures raises important questions regarding the mechanisms of its development. One of the main issues is to establish whether PCNSL is simply a “DLBCL, not otherwise specified (NOS)” that happened to develop in the CNS, or should instead be considered a distinct lymphoma entity.

Genome-wide DNA sequencing of PCNSLs has shown that frequent mutations at either of two hotspots, MYD88 L265 or CD79B Y196, play a key role in constitutively up-regulating the B-cell receptor signaling/NF-κB pathway [1]. In this study, we hypothesized that the DNA methylation profile may reveal a unique identity of PCNSL that is distinct from systemic DLBCL.

A total of 100 PCNSLs and 4 metastatic DLBCL specimens in the CNS were subjected to Illumina Infinium HumanMethylation 450K BeadChip (450k array) methylation analysis (Supplementary fig. S1, Supplementary table S1). Methylation and clinical data for a total of 96 systemic DLBCLs were also obtained from the Gene Expression Omnibus (GEO) database and The Cancer Genome Atlas (TCGA) (Supplementary table S2–3). The potential involvement of the Epstein-Barr virus (EBV) was examined using EBER in situ hybridization and nested PCR for EBNA2 (see Supplementary materials, Supplementary figure S2a, Supplementary table S4). Gene Ontology (GO) term enrichment analysis was conducted using the DAVID Bioinformatics Resource for probes differentially methylated in PCNSL (Supplementary materials, Supplementary table S2). In the copy number aberration (CNA) analysis, we focused on reported recurrent CNAs in PCNSL or systemic DLBCL, using the 450k array data as published (Supplementary materials). More detailed information is available in Supplementary materials.

A total of 95 PCNSL samples, 2 metastatic DLBCLs in CNS (primary DLBCLs, NOS) and 73 systemic DLBCLs fulfilled the criteria and were enrolled for further analysis (Supplementary fig. S2b, S2c, S3–4, Supplementary table S1, S3). The results showed that, in comparison to systemic DLBCL, a significantly larger number of CpG sites within CpG islands (CGIs) were methylated in PCNSL (Fig. 1a). There were 857 genes for which the TSS200 probes (probes within 200 base pairs of the transcription start site in a CGI) were hypermethylated in PCNSL (adjusted p value <10⁻⁵ and a difference of more than 0.10 in methylation β value), in contrast to 217 genes in systemic DLBCL (Supplementary table S5–6). Consensus clustering of PCNSL, metastatic DLBCL in CNS and systemic DLBCL using the most variable CGI probes (SD >0.22) within systemic DLBCL robustly identified three distinct DNA methylation clusters, which were named “lymph nodes like”, “moderately methylated” and “highly methylated” (Fig. 1b, Supplementary fig. S5a–c). The most striking finding was that PCNSLs and systemic DLBCLs were clearly separated into different clusters (p < 0.0001; Fisher’s exact test). A number of variable probes were highly methylated in PCNSL while they were unmethylated in normal lymph nodes (PS1 and PS3 in Fig. 1b). On the other hand, a set of variable probes highly methylated in normal lymph nodes (PS2 in Fig. 1b) was hypomethylated in PCNSL, although these probes remained hypermethylated in systemic DLBCLs. Interestingly, the 2 metastatic DLBCLs in CNS also showed the hypomethylated pattern of PS2 seen in PCNSL (Fig. 1b). The status of the mutation hotspots MYD88 L265 and CD79B Y196 in PCNSLs was not associated with the methylation subclasses. No mutations were found at these hotspots in the 2 metastatic DLBCLs in CNS, confirming the findings of a previous report [2]. GCB versus non-GCB subtype in PCNSL was not correlated with the methylation subgroups (Supplementary fig. S6). The extent of methylation deviation from the lymph nodes was larger in PCNSLs than in systemic DLBCLs (p value <0.0001; Wilcoxon rank-sum test) (Supplementary fig. S7).
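The probe-level selection described above (a rank-sum test per probe, multiple-testing correction, and a Δβ threshold) can be illustrated with a minimal sketch. This is not the pipeline used in the study: the Benjamini-Hochberg procedure as the FDR method, the use of group means for Δβ, the normal approximation for the rank-sum p value, and all function and probe names are assumptions made for illustration only.

```python
import math
from statistics import mean


def rank_sum_p(x, y):
    """Two-sided Wilcoxon rank-sum p value (normal approximation).

    Ties receive average ranks; the variance is not tie-corrected,
    which is an illustrative simplification.
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n, n1, n2 = len(pooled), len(x), len(y)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of 1-based ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    mu = n1 * (n + 1) / 2
    sigma = math.sqrt(n1 * n2 * (n + 1) / 12)
    z = (w - mu) / sigma
    return math.erfc(abs(z) / math.sqrt(2))


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p values (assumed FDR method)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda idx: pvals[idx])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p value down
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def hypermethylated(beta_a, beta_b, alpha=1e-5, min_delta=0.10):
    """Probes with adjusted p < alpha and mean β(a) - mean β(b) > min_delta.

    beta_a / beta_b map a probe ID to the β values of each sample group.
    """
    probes = list(beta_a)
    adj = bh_adjust([rank_sum_p(beta_a[p], beta_b[p]) for p in probes])
    return [
        p for p, q in zip(probes, adj)
        if q < alpha and mean(beta_a[p]) - mean(beta_b[p]) > min_delta
    ]
```

Calling `hypermethylated(pcnsl_beta, dlbcl_beta)` with two dictionaries of per-probe β values (hypothetical variable names) returns the probe IDs passing both thresholds; swapping the arguments gives the probes hypermethylated in the other group.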

Fig. 1

a A volcano plot for DNA methylation showing differentially methylated CpG island probes between PCNSL and systemic DLBCL (p value: Wilcoxon rank-sum test, FDR-corrected). Note that the cluster of significantly highly methylated probes (FDR-corrected p value <10⁻⁵, ∆β > 0.10) is present mostly in PCNSLs, showing a hypermethylated phenotype in PCNSL as compared to systemic DLBCL. b Unsupervised 2-way hierarchical clustering of DNA methylation and heat map for the full tumor cohort (Supplementary figure S1) based on the most variable methylation probes (SD > 0.22) within systemic DLBCLs. The probes are located within CpG islands; probes on sex chromosomes and probes specifically methylated or unmethylated in brain compared with lymph nodes have been excluded. Probe set 1 (PS1) and probe set 3 (PS3) showed low methylation in lymph nodes but were highly methylated in PCNSLs and systemic DLBCLs. Probe set 2 (PS2) was highly methylated in lymph nodes and systemic DLBCL but hypomethylated in PCNSL. A summary of clinical features, as well as the MYD88 L265 and CD79B Y196 genomic status of PCNSLs and metastatic DLBCLs in CNS, is shown in the columns below the heat map. The tumor cell content is shown in the bottom panel.
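The probe filter used for clustering (the most variable probes, SD > 0.22, within systemic DLBCLs) can be sketched as follows. Whether the study used the sample or the population standard deviation is not stated; the sample SD and the function and probe names here are assumptions for illustration.

```python
from statistics import stdev


def most_variable_probes(beta_by_probe, sd_cutoff=0.22):
    """Return probe IDs whose β values vary by more than sd_cutoff.

    beta_by_probe maps a probe ID to the β values of the reference
    group only (here, systemic DLBCL samples). Sample SD is assumed.
    """
    return [
        probe for probe, values in beta_by_probe.items()
        if stdev(values) > sd_cutoff
    ]
```

The selected probe IDs would then be used to subset the β matrix of the full cohort before clustering.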

As a generally observed epigenetic phenomenon, CpG sites in unexpressed genes tend to acquire methylation during the development of cancers [6]. To identify these “passively” methylated CpG sites in PCNSLs (presumably non-pathogenic, as expression would not be affected), we examined gene expression in lymph nodes (normal mature B-cell tissues) and excluded probes associated with unexpressed genes from the gene ontology analysis (Supplementary materials, Supplementary table S7). We then investigated which biological functions were over-represented among the genes differentially methylated between PCNSLs and systemic DLBCLs. A subset of PCNSLs was characterized by hypermethylation of genes involved in several important pathways, such as biological adhesion and cell adhesion (Supplementary table S8). On the other hand, biological processes related to glycosylation were highly significantly over-represented in the unmethylated probe sets in PCNSLs (Supplementary table S9).
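The idea of over-representation can be made concrete with a plain hypergeometric tail probability. This is only a sketch of the underlying principle: DAVID reports its own modified Fisher exact statistic, so values from this function are not expected to match DAVID output, and the function and parameter names are hypothetical.

```python
from math import comb


def enrichment_p(k, n, K, N):
    """P(X >= k) for a hypergeometric draw.

    N: genes in the background set
    K: background genes annotated with the GO term
    n: genes in the differentially methylated list
    k: listed genes annotated with the GO term
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)
```

A small p value indicates that the term is annotated to more genes in the list than expected by chance given the background; in practice the p values for all tested terms would also need multiple-testing correction.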

PCNSLs and systemic DLBCLs are histopathologically indistinguishable. Overexpression of BCL6 together with expression of cell surface IgM, as well as evidence of aberrant somatic hypermutation, strongly suggests that PCNSL cells have been arrested at a mature B-cell differentiation stage. Considering that physiological somatic hypermutation of lymphocytes requires the microenvironment of secondary lymphoid organs, it appears that the initial genetic events of PCNSL occur in these organs rather than in the CNS. In support of this, Fukumura et al. identified MYD88 L265P mutations at a low frequency in peripheral blood mononuclear cells of PCNSL patients [1]. This mutant DNA is unlikely to be derived from PCNSL cells that had infiltrated the systemic circulation, because other mutations present in the tumors at even higher allelic frequency than MYD88 were not detected. These findings are consistent with the previous observation that tumor-associated rearrangements of immunoglobulin variable genes were clonally present in secondary lymphoid structures of PCNSL patients [3]. Together, these data strongly suggest that pre-lymphoma cells first appear outside the CNS and circulate in the peripheral blood, and possibly in lymphatic vessels, before they migrate into the CNS.

Our methylation analysis demonstrated that PCNSLs have a DNA methylation profile distinct from that of systemic DLBCLs or normal lymph nodes, while a subset of systemic DLBCLs shares a similar methylation profile with PCNSL (Fig. 1b). Taken together, our results suggest that PCNSL may develop from a subset of systemic DLBCLs that acquired DNA hypermethylation, possibly resulting in a higher affinity for the CNS. We therefore propose that PCNSL originates from mature B cells residing outside the CNS and forms an entity distinct from systemic DLBCL, as PCNSL shows a wider methylation divergence from mature B cells than most systemic DLBCLs, which presumably provides a growth advantage in the CNS environment. Our hypothesis, however, requires further validation in an independent cohort.

In support of this hypothesis, genes involved in cell adhesion were significantly over-represented among the genes hypermethylated in PCNSLs as compared to systemic DLBCLs, which would result in different cell–cell or cell–endothelial attachment phenotypes between PCNSLs and systemic DLBCLs. Activation of cell adhesion molecules may function as a trigger for tumor cell invasion [4], and glycosylation was over-represented among the unmethylated gene sets in PCNSLs. The combination of these alterations in cell surface molecules may represent a key molecular signature of PCNSL biology, which could potentially lead to the development of a novel targeted therapy against these tumors.

Our study thus provides the first molecular evidence that PCNSL is a biologically distinct entity from systemic DLBCL. These findings will hopefully provide new scope for developing alternative therapeutic strategies for PCNSL.