Introduction

The outcome of patients with primary CNS lymphoma (PCNSL) has greatly improved: with high-dose methotrexate (HD-MTX)-based chemotherapy as the backbone of all effective therapies, survival has increased and, at least in younger patients, long-term survival can be achieved in a significant proportion of patients (Korfel and Schlegel 2013). Whether whole brain radiotherapy (WBRT) should be part of primary therapy remains a major point of discussion. The G-PCNSL-SG-1 non-inferiority trial addressed this question by randomizing patients between WBRT as a compulsory part of first-line therapy (early WBRT) and its omission from first-line therapy (no early WBRT; Fig. 1). As its main result, G-PCNSL-SG-1 found no significant difference in overall survival (OS) between the two arms, although non-inferiority of the no early WBRT arm could not be formally proven (Thiel et al. 2010; Korfel et al. 2015).

Fig. 1 Design of the G-PCNSL-SG-1 trial. WBRT whole brain radiotherapy (45 Gy), HD-MTX high-dose methotrexate, CR complete response, PR partial response, SD stable disease, PD progressive disease, HD-AraC high-dose cytarabine. *Since 08/2006 combined with ifosfamide 1.5 g/m2 daily, d3–5

The application of WBRT in combination with HD-MTX-based chemotherapy has been associated in non-comparative trials with late neurotoxicity, including progressive cognitive decline, in a substantial proportion of patients, particularly the elderly (Correa et al. 2004; Harder et al. 2004; Herrlinger et al. 2005; Correa et al. 2012). G-PCNSL-SG-1 was the first randomized trial to prospectively collect and evaluate data on quality of life (QoL) and cognition.

Materials and methods

Patient population, randomization and treatment

From 2000 to 2009, G-PCNSL-SG-1 recruited adult immunocompetent patients with newly diagnosed, histologically confirmed PCNSL. A Karnofsky performance score (KPS) of less than 50 unrelated to PCNSL or less than 30 due to PCNSL was an exclusion criterion. Patients were randomized 1:1 to HD-MTX-based chemotherapy + WBRT vs. HD-MTX-based chemotherapy without primary WBRT (Fig. 1). The comprehensive list of all inclusion/exclusion criteria and details on randomization have been published (Thiel et al. 2010). The trial (ClinicalTrials.gov number NCT00153530) was approved by all local ethics committees; all patients gave their written informed consent.

Treatment included 6 biweekly courses of HD-MTX (4 g/m2, 4 h intravenous infusion). If the glomerular filtration rate (GFR) was below 1.67 ml/s, the MTX dose was reduced by the percentage by which the GFR fell below 1.67 ml/s. Patients with GFR <0.83 ml/s were excluded. After enrollment of 409/551 patients, all further patients received additional ifosfamide 1.5 g/m2 (daily, day 3–5). Therapeutic decisions after 6 courses were made according to the restaging results and the randomization result obtained at the time of study inclusion (Fig. 1): patients in complete response (CR) either received WBRT (45 Gy in 1.5 Gy daily fractions) as consolidation therapy (early WBRT arm) or were followed without any further primary therapy (no early WBRT arm); patients without CR received either WBRT (early WBRT arm) or high-dose cytarabine (no early WBRT arm; 2 × 3 g/m2 per day for 2 days every three weeks). All patients were to be followed with neurological examination, contrast-enhanced MRI, quality of life (QoL) assessments and Mini Mental State Examinations (MMSE) every 3 months (first year), 4 months (year 2) or 6 months (starting from year 3).
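The GFR-based dose adjustment described above can be illustrated with a short sketch. It assumes that "reduced by the percentage that the GFR was below 1.67 ml/s" means a proportional reduction, i.e., a reduction fraction of (1.67 − GFR)/1.67. The function and constant names are hypothetical, and the code is an illustration of the arithmetic only, not a dosing tool.

```python
# Illustrative sketch only; names and the proportional-reduction reading
# of the protocol rule are assumptions, not taken from the trial protocol.
GFR_FULL_DOSE = 1.67       # ml/s; at or above this GFR the full dose applies
GFR_EXCLUSION = 0.83       # ml/s; patients below this GFR were excluded
FULL_DOSE_G_PER_M2 = 4.0   # HD-MTX dose per course in g/m2


def mtx_dose_g_per_m2(gfr_ml_per_s: float) -> float:
    """Return the MTX dose (g/m2) under the assumed proportional rule."""
    if gfr_ml_per_s < GFR_EXCLUSION:
        raise ValueError("GFR below 0.83 ml/s: patient not eligible")
    if gfr_ml_per_s >= GFR_FULL_DOSE:
        return FULL_DOSE_G_PER_M2
    # Fraction by which the GFR falls below the full-dose threshold.
    reduction = (GFR_FULL_DOSE - gfr_ml_per_s) / GFR_FULL_DOSE
    return FULL_DOSE_G_PER_M2 * (1.0 - reduction)
```

Under this reading, a GFR of 1.25 ml/s lies about 25% below the threshold, so the dose would be about 3 g/m2.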

As shown in the previously published CONSORT statement (Thiel et al. 2010), 318 out of 551 patients randomized upfront received the treatment specified by randomization and were thus included in the per-protocol population. To most accurately define the effects of WBRT and to account for the substantial number of patients not receiving their assigned treatment, the present analysis of QoL and MMSE is restricted to the per-protocol population.

QoL and MMSE analysis

QoL was an exploratory secondary endpoint and was determined using the EORTC self-reporting questionnaires EORTC-QLQ-C30 and EORTC-QLQ-BN20 [8,9]. EORTC-QLQ-C30 includes 30 items regarding general aspects of QoL of cancer patients. EORTC-QLQ-BN20 adds 20 items addressing brain-specific questions. Responses were scored, analyzed, and transformed to a 0–100 scale (Fayers et al. 1999). For functional scores, a higher score is favorable, whereas for symptom scores, a higher score indicates a higher symptom burden and is unfavorable. Cognition was measured using the MMSE (Folstein et al. 1975), as specified in the study protocol, which dates from the year 2000, although the MMSE is nowadays not regarded as the standard for neurocognitive testing when looking for neurotoxic sequelae of cancer therapies. The MMSE is a paper-and-pencil test that summarizes different cognitive domains and provides an overall result on a 30-point scale.
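The transformation to a 0–100 scale mentioned above can be sketched as follows. This follows the linear transformation generally described in the EORTC scoring manual: the raw score is the mean of the answered items, functional scales are reversed so that higher is better, and symptom scales and global health status are not reversed. The function names are hypothetical, and the handling of missing items (a scale is scored if at least half of its items were answered) is stated here as an assumption about the convention used.

```python
from typing import Optional, Sequence


def raw_score(items: Sequence[Optional[int]]) -> Optional[float]:
    """Mean of answered items; None if fewer than half were answered."""
    answered = [x for x in items if x is not None]
    if not answered or len(answered) * 2 < len(items):
        return None
    return sum(answered) / len(answered)


def functional_score(items: Sequence[Optional[int]],
                     item_range: int = 3) -> Optional[float]:
    """0-100 score for a functional scale (higher = better functioning).

    item_range is the difference between the highest and lowest possible
    item response (3 for items answered on a 1-4 scale).
    """
    rs = raw_score(items)
    if rs is None:
        return None
    return (1.0 - (rs - 1.0) / item_range) * 100.0


def symptom_score(items: Sequence[Optional[int]],
                  item_range: int = 3) -> Optional[float]:
    """0-100 score for a symptom scale (higher = more symptom burden).

    The same formula applies to global health status, whose items are
    answered on a 1-7 scale (item_range=6); there a higher score is better.
    """
    rs = raw_score(items)
    if rs is None:
        return None
    return (rs - 1.0) / item_range * 100.0
```

For example, a patient answering "not at all" (1) to every item of a functional scale obtains 100, while the same answers on a symptom scale yield 0.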

The patients were asked to complete the QoL questionnaires and MMSE at baseline and subsequently at every clinical and imaging follow-up examination until death. Thus, QoL data were obtained irrespective of disease progression. To determine the development of QoL and MMSE over time, median values for QoL and MMSE scores were determined for the following time intervals: up to day 20 after randomization (baseline), day 21–365 (year 1), day 366–730 (year 2), day 731–1095 (year 3), and day 1096–1460 (year 4). If several data points per time interval were available for a particular score, the latest data point obtained was included in the analysis. Differences in QoL and MMSE between the arms in year 2 were analyzed for statistical significance using the Mann–Whitney U test.
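The interval-based summary described above can be sketched in a few lines. The sketch assigns each measurement to a time interval by day after randomization, keeps the latest measurement per patient and interval, and computes medians. It also includes the Mann–Whitney U statistic itself (without a p value). The data layout (tuples of patient identifier, day and score) and all function names are assumptions made for illustration.

```python
from statistics import median

# Follow-up windows in days after randomization, as defined in the text.
YEARLY_WINDOWS = [
    ("year 1", 21, 365),
    ("year 2", 366, 730),
    ("year 3", 731, 1095),
    ("year 4", 1096, 1460),
]


def interval_for_day(day):
    """Map a day after randomization to its interval label (or None)."""
    if day <= 20:
        return "baseline"
    for label, first, last in YEARLY_WINDOWS:
        if first <= day <= last:
            return label
    return None  # beyond year 4: not part of this analysis


def latest_per_interval(measurements):
    """Keep the latest score per (patient, interval).

    measurements: iterable of (patient_id, day, score) tuples.
    """
    latest = {}
    for patient_id, day, score in measurements:
        label = interval_for_day(day)
        if label is None:
            continue
        key = (patient_id, label)
        if key not in latest or day > latest[key][0]:
            latest[key] = (day, score)
    return {key: score for key, (_, score) in latest.items()}


def median_by_interval(measurements):
    """Median score per interval, one value per patient and interval."""
    by_label = {}
    for (_, label), score in latest_per_interval(measurements).items():
        by_label.setdefault(label, []).append(score)
    return {label: median(scores) for label, scores in by_label.items()}


def mann_whitney_u(x, y):
    """U statistic of sample x versus y; ties count 0.5. No p value."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in x for b in y)
```

In practice, the year 2 scores of the two arms would be passed to a statistics package to obtain the p value; the U statistic here only shows what the test counts, namely the number of between-arm pairs in which one arm scores higher.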

Additionally, a linear mixed model analysis for the change of QoL and MMSE scores over time was performed. This analysis models the dependence of repeated measurements over time by including a random effect for the patient. It was carried out irrespective of tumor progression, i.e., values obtained after tumor progression were also included. For each QoL dimension, a mean change/year in relation to pretherapeutic baseline levels was calculated. The development of the QoL scores in the two arms was compared by testing for time*treatment interaction. Since this analysis was purely exploratory, there was no correction for multiple testing.
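One common way to write such a model is given below. The exact specification used in the trial is not stated beyond a patient-level random effect and a time*treatment interaction, so the random-intercept form and all symbols are assumptions: y_{ij} is the score of patient i at measurement j, t_{ij} the time since baseline in years, and g_i an indicator for the treatment arm.

```latex
\[
  y_{ij} = \beta_0 + \beta_1 t_{ij} + \beta_2 g_i + \beta_3\, t_{ij}\, g_i
           + b_i + \varepsilon_{ij},
  \qquad
  b_i \sim \mathcal{N}(0, \sigma_b^2), \quad
  \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
\]
```

In this formulation, \beta_1 is the mean change per year in the reference arm, \beta_1 + \beta_3 is the mean change per year in the other arm, and the test for time*treatment interaction corresponds to testing \beta_3 = 0.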

Results

Patients’ characteristics and QoL functional scores

The details of patients’ characteristics in the per-protocol population used for analysis have already been reported (Thiel et al. 2010) and are summarized again in Table 1. At baseline, there were no apparent differences between the early WBRT and the no early WBRT arm regarding age, initial KPS, global health status and initial MMSE (Table 1).

Table 1 Patients’ characteristics in patients evaluable for quality of life (item “global health status”) at baseline and in year 2

QoL: functional scores

QoL functional scores were evaluated for six dimensions: global health status, physical functioning, role functioning, emotional functioning, social functioning and cognitive functioning. Figure 2a (global health status, emotional functioning, social functioning and cognitive functioning) and Supplementary Fig. 1a (physical functioning and role functioning) show the development of these functional scores in the two randomization arms over time. The percentage of surviving patients providing data for these evaluations was 45% (70 of 156 patients alive) in year 2, 38% (42 of 110 patients alive) in year 3 and 29% (20 of 68 patients alive) in year 4. In year 2 after randomization, cognitive functioning and global health status were reduced in early WBRT patients compared with patients without early WBRT. The patients evaluable for QoL in year 2 did not show evident differences between the two randomization arms regarding prognostically relevant parameters such as age distribution or KPS (Table 1). For both time points, baseline and year 2, OS of patients analyzable for QoL did not differ between the arms (Supplementary Fig. 2). In years 3 and 4 after randomization, median QoL scores in the early WBRT group remained below the median scores in the no early WBRT group (Fig. 2a); however, given the small database and the fact that the majority of patients did not report QoL data, these differences should not be overestimated.

Fig. 2 Time course of median scores and interquartile ranges (IQR) for EORTC-QLQ-C30 and BN20 dimensions and the Mini Mental State Examination (MMSE) in the “early whole brain radiotherapy (WBRT)” arm and the “no early WBRT” arm of the G-PCNSL-SG-1 trial. P values (Mann–Whitney U test) for comparisons between the arms in year 2 after randomization are inserted, *p < 0.05. a Selected functional scores: global health, cognitive functioning, social functioning and emotional functioning. Higher values mean high functional status with less impairment in quality of life. Further functional scores are shown in Supplementary Fig. 1a. b Selected symptom scores: communication deficit, fatigue, appetite loss and hair loss. Higher values mean a higher symptom burden with higher negative impact on quality of life. Further symptom scores are shown in Supplementary Fig. 1b. c Mini Mental State Examination (MMSE)

Median scores for emotional functioning and social functioning (Fig. 2a) were lower in the early WBRT group than in the no early WBRT group at all follow-up time points. The differences did not reach statistical significance. For physical functioning and role functioning (Supplementary Fig. 1a), no differences between the treatment groups were found.

QoL: symptom scores

Figure 2b (communication deficit, fatigue, appetite loss, hair loss) and Supplementary Fig. 1b (diarrhea, nausea and vomiting, insomnia, constipation, dyspnea, financial difficulties, pain, future uncertainty, motor dysfunction, visual dysfunction, bladder control, drowsiness, headaches, itchy skin, seizures, weakness of legs) show the development of symptom scores in the two study arms in the years after randomization. Three of the twenty symptom scores in the EORTC questionnaires showed differences in favor of the no early WBRT group at 2 years after randomization: fatigue, appetite loss and hair loss (Fig. 2b). Borderline significance was found for differences in communication deficit (p = 0.07). In year 4, differences between the treatment groups were no longer found for the communication deficit and hair loss scores (Fig. 2b). As observed for the functional scores, there was a tendency for a lower symptom burden in the no early WBRT group in some symptom scores (Fig. 2b and Supplementary Fig. 1b).

Longitudinal mixed model analysis for functional and symptom scores

A linear mixed model analysis for mean changes of QoL over time in comparison to pretherapeutic baseline was performed for all 26 QoL dimensions of the EORTC questionnaires (Table 2). QoL was superior in the no early WBRT arm. Differences in favor of the no early WBRT arm were found for 15 of 26 dimensions: physical functioning, role functioning, emotional functioning, cognitive functioning, social functioning, fatigue, pain, insomnia, appetite loss, future uncertainty, visual disorder, communication deficit, drowsiness, itchy skin and weakness of legs. In none of the 26 dimensions was the early WBRT arm superior.

Table 2 Mean change per year according to the mixed model longitudinal analysis (transformed scale of values between 0 and 100)

MMSE analysis

At baseline, there were no overt differences between the early WBRT and the no early WBRT group regarding age and initial KPS. In year 2 after randomization, global cognition as determined by MMSE was worse in the early WBRT group than in the no early WBRT group (Fig. 2c, p = 0.002). In years 3 and 4, the median MMSE in the no early WBRT group remained higher than in the early WBRT group. While 45% (70/156) of patients had an MMSE test result in year 2, the absolute number of evaluable patients dropped during year 3 [47/109 patients (43%)] and year 4 [23/68 patients alive (34%)]. The mixed model analysis showed that with no early WBRT the MMSE score slightly improved by 0.27/year on a 30-point scale, while in the early WBRT group the mean change per year was −0.06 (p = 0.13).

Discussion

The G-PCNSL-SG-1 trial is the first PCNSL trial documenting a negative influence of early WBRT on QoL parameters in a prospective, randomized setting. Importantly, the QoL domains related to neuropsychological deficits were the most affected: in year 2 after randomization (irrespective of progression), cognitive functioning was apparently reduced and fatigue was clearly increased in the early WBRT arm. This finding is also supported by a linear mixed model analysis. The changes in these and many other QoL domains may be highly relevant for the daily life of these patients, since global health/overall quality of life was also reduced in the early WBRT arm in year 2 after randomization.

Subjective impairments in cognitive functioning documented in QoL analyses were mirrored by lower scores in the MMSE. The negative influence of WBRT on MMSE test scores may be underestimated since MMSE is relatively insensitive to the cluster of symptoms associated with late neurotoxicity, i.e., attention and concentration deficits (Meyers and Wefel 2003). This may contribute to the fact that median MMSE values found in G-PCNSL-SG-1 are relatively high, between 26 and 29 of 30 achievable points (Fig. 2c). Nevertheless, clear differences between the treatment groups were evident in year 2 after randomization. The higher rates of cognitive dysfunction observed with early WBRT are consistent with previous non-randomized observational studies showing a high rate of cognitive deficits and/or impaired QoL after HD-MTX followed by WBRT (Correa et al. 2004; Harder et al. 2004; Herrlinger et al. 2005; Correa et al. 2012; Ekenel et al. 2008), while neither QoL impairment nor neuropsychological deficits were found in 21 long-term survivors treated with chemotherapy alone (Juergens et al. 2010). Also, a non-randomized analysis of cognitive functioning and QoL in PCNSL long-term survivors treated with or without WBRT suggests that the addition of WBRT increases the risk of neurotoxicity (Doolittle et al. 2013).

We are aware of the suboptimal data structure of our analysis, caused in particular by the low return rates of questionnaires. Nevertheless, to our knowledge, prospectively collected QoL data from a phase III trial evaluating the role of consolidating WBRT in primary therapy of PCNSL patients are not available elsewhere. Thus, the results presented here might still be helpful for determining whether WBRT has a place in primary treatment of PCNSL. The prolongation of progression-free survival (PFS) induced by WBRT in the G-PCNSL-SG-1 trial may be an argument in favor of WBRT in primary therapy. However, the G-PCNSL-SG-1 trial also showed no differences in overall survival between the groups (Thiel et al. 2010; Korfel et al. 2015). The present data now show that some dimensions of quality of life, and in particular cognitive performance, are reduced during the remaining lifetime in patients receiving consolidating WBRT. Thus, a beneficial effect of WBRT beyond PFS prolongation cannot be detected. One could argue that HD-MTX with or without ifosfamide, as applied here, is not sufficient to induce the deep remission that would be required for additional WBRT to be truly consolidating and survival prolonging, and that only a more complex chemotherapy, e.g., including rituximab, high-dose cytarabine, thiotepa, and/or high-dose chemotherapy with stem cell rescue, could achieve a remission that allows WBRT to exert its full survival-prolonging potential (Ferreri et al. 2009; Illerhaus et al. 2006). So far, there are no data supporting such a theoretical concept in PCNSL. On the basis of the G-PCNSL-SG-1 data, it therefore appears reasonable to omit neurotoxic WBRT in the form in which it was given in the G-PCNSL-SG-1 trial (45 Gy in 1.5 Gy fractions) from primary HD-MTX-based therapy. This is even more prudent since the combination of WBRT with a more intense chemotherapy than HD-MTX alone has the potential to be even more neurotoxic.

In the G-PCNSL-SG-1 trial as well as in published case series, WBRT was applied with a comparatively high total dose of 45 Gy, but with decreased single fractions of 1.5 Gy (Thiel et al. 2010; Correa et al. 2012). One way to reduce the risk of neurotoxicity could be to use lower total doses. In a study on patients with brain metastases, cognitive decline was already found 4 months after WBRT with 30 Gy in 2.5 Gy single fractions (Chang et al. 2009). In contrast, another trial on brain metastases using WBRT with 30 Gy found neither cognitive decline nor loss of QoL (Sun et al. 2011). Further reduced total doses of 23.4 Gy do not appear to have a strong detrimental effect on neurocognition in PCNSL patients (Correa et al. 2009). However, there is also no proof of a positive effect on overall survival conferred by dose-decreased WBRT (Ekenel et al. 2008; Shah et al. 2007). Another way of reducing neurotoxicity might be the technique of hippocampal sparing (Gondi et al. 2010). The true benefit of this technique for PCNSL patients remains to be shown. Given the lack of OS prolongation by WBRT, however, the best way to reduce the rate of late neurotoxicity is to delay WBRT until progression/relapse, as done in the no early WBRT arm of the G-PCNSL-SG-1 trial. This concept is also supported by observations of Hottinger et al. (2007) showing that the rate of neurotoxicity is reduced if the time interval between HD-MTX and WBRT is longer than 6 months.

In summary, the data from the G-PCNSL-SG-1 trial do not support the use of consolidating WBRT as part of first-line therapy of PCNSL: the omission of WBRT did not significantly shorten overall survival, and early WBRT was associated with a reduction of patients’ self-perceived quality of life and general cognitive function. Therefore, new concepts for consolidation therapy after HD-MTX-based chemotherapy in PCNSL patients have to be explored.