Introduction

Follicular lymphoma is the second most frequent non-Hodgkin lymphoma. While generally considered the prototypic indolent lymphoma, follicular lymphoma is characterized by significant heterogeneity in clinical presentation, disease biology, and overall prognosis [1, 2]. In addition to the heterogeneity in clinical and biological characteristics, there is significant variability in the therapeutic approach, as demonstrated by the National LymphoCare Study [3].

Follicular lymphoma heterogeneity has been an important focus of clinical, translational, and basic research over the last several decades, in an attempt to systematize this diverse disease entity. A definitive systematization of follicular lymphoma is not yet at hand, but there have been significant strides towards identifying patients who require initial therapeutic intervention as well as those who are at higher risk of disease progression and therapeutic failure after treatment has been started. While a fully personalized, risk-adapted approach is probably not yet available for all follicular lymphoma patients, it is abundantly clear that with currently available agents, a “one size fits all” approach is not adequate.

Research conducted in the last decade has identified response to first-line therapy [4] as a strong predictor of long-term outcome. Observations from the National LymphoCare Project also identified early relapse and progression of disease in the first 24 months after R-CHOP chemotherapy (termed “POD24”) as a predictor of poor survival in FL, which occurs in approximately 20% of patients [5••].

In this article, we will delineate the currently available diagnostic and prognostic parameters that inform prognosis prior to first-line therapy.

Diagnostic Considerations

Tissue Diagnosis and Histologic Grade

Follicular lymphoma presents in the majority of cases with asymptomatic waxing and waning lymphadenopathy that is often present for prolonged periods of time prior to diagnosis. Advanced stage disease is present in over 80% of cases [3], and bone marrow involvement (in the form of paratrabecular aggregates) is reported to occur in approximately 70% of patients. Despite being diagnosed in advanced stages, fewer than a third of patients report B symptoms at diagnosis, and symptomatic, macroscopic, solid organ involvement is rare.

Adequate histologic assessment at the time of diagnosis is fundamental. Excisional biopsy of an affected lymph node is necessary to assess architecture and define the histologic grade [6, 7]. Core biopsies are a less desirable alternative and are acceptable only if some tissue architecture is preserved, whereas fine needle aspirates should be avoided when there is suspicion of an indolent lymphoma [7].

The WHO classification identifies three separate grades of FL based on the proportion of centrocytes (small to medium sized cells) and centroblasts (large non-cleaved cells) present in the tissue [8]. Grading increases from 1 to 3 based on the number of centroblasts per high power field (hpf): grade 1 has 0–5 centroblasts per hpf; grade 2 has 6–15 centroblasts per hpf; and grade 3 has more than 15 centroblasts per hpf. Grade 3 is further subclassified into grade 3A, with residual centrocytes, and grade 3B, with sheets of centroblasts [9]. Histologic grade at diagnosis correlates with biologic and clinical behavior [10]. However, the correlation of histologic grading with prognosis continues to be debated [11], and more recent versions of the WHO classification have grouped FL grades 1 and 2 together because of the absence of clinical differences and lack of interobserver reproducibility [12]. Recent gene expression profiling (GEP) studies confirm that FL grades 1–2 can be grouped together despite heterogeneity in proliferation indices [13] but have not yet clarified the demarcation between grades 1–2, grade 3A, and grade 3B. Piccaluga and coworkers reported that FL grades 1–2 and 3A clustered together and were separate from FL grade 3B, which in turn was also distinct from germinal center B cell DLBCL [14]. Horn and colleagues conducted an analysis that included immunohistochemistry, fluorescence in situ hybridization, and gene expression profiling, finding that FL grade 3A clustered with grade 3B and had differential gene expression from FL grades 1–2 [13]. Population-based cohort studies from Sweden suggested that FL grade 3A had a clinical course similar to that of grades 1–2, while FL grade 3B had a more aggressive course and lower overall survival, but a decreased rate of relapse beyond 5 years. Anthracycline-containing first-line therapy resulted in improved outcomes in FL grade 3B [15].
More recently, investigators from the SWOG Cooperative Group reported on the prognostic role of histologic grade on a cohort of FL patients treated with cyclophosphamide, doxorubicin, vincristine, and prednisone (CHOP) plus rituximab vs. CHOP plus (131)iodine-tositumomab. As with most FL studies, the majority of patients were FL grades 1–2 (452/491, 92%); 7.9% corresponded to grade 3A, and one FL grade 3B was excluded from the analysis due to low number. There were no differences in 10-year overall survival (OS) or progression-free survival (PFS) between FL grades 1–2 and FL grade 3A [16]. These results combined with the population-based findings discussed above [15] suggest that FL grades 1–3A have similar outcomes when treated with anthracycline-based chemoimmunotherapy regimens. Over the last decade, randomized trials have evaluated bendamustine-based chemoimmunotherapy regimens for treatment of FL [17,18,19], but these trials have generally excluded patients with FL grade 3 (A or B). Given these data, our therapeutic approach to FL grade 1–2 is based on disease burden as discussed in subsequent sections, whereas our management of FL grade 3B includes anthracycline-based chemoimmunotherapy upon diagnosis. Management of FL grade 3A patients is less clear, although our general approach is to offer anthracycline-based chemoimmunotherapy based upon disease burden indications.
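The centroblast thresholds described above can be written as a small lookup. The following is a minimal sketch for illustration only, not a diagnostic tool; the function name and the boolean flag for sheets of centroblasts are assumptions introduced here, and the count is taken as a single representative value per hpf.

```python
def fl_histologic_grade(centroblasts_per_hpf: int,
                        sheets_of_centroblasts: bool = False) -> str:
    """Map a centroblast count per high power field to the grade in the text.

    The name and signature are hypothetical; thresholds follow the text:
    0-5 -> grade 1, 6-15 -> grade 2, more than 15 -> grade 3 (3A or 3B).
    """
    if centroblasts_per_hpf < 0:
        raise ValueError("centroblast count cannot be negative")
    if centroblasts_per_hpf <= 5:
        return "1"
    if centroblasts_per_hpf <= 15:
        return "2"
    # More than 15 centroblasts per hpf: grade 3A if centrocytes remain,
    # grade 3B if the centroblasts form sheets.
    return "3B" if sheets_of_centroblasts else "3A"
```

Because grades 1 and 2 are grouped together in recent WHO versions, a caller could collapse the first two return values into a single "1–2" category.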

Disease Staging and Tumor Burden

Once the diagnosis of follicular lymphoma is confirmed, the next step is accurate staging. Physical examination, with a focus on lymphatic nodal sites, spleen, and liver, should be complemented by imaging studies. Staging images may include computerized tomography of the chest, abdomen, and pelvis, and may include the neck and other areas depending on clinical presentation. In our practice, we prefer positron emission tomography/computed tomography (PET/CT)-based imaging as the initial staging procedure because it may identify occult sites of disease [20] and because of its prognostic implications, discussed in a later section. A bone marrow biopsy should be performed to determine involvement in subjects for whom treatment is planned; this procedure should also be considered in patients with early-stage disease for whom upstaging would change treatment plans.

Disease stage has prognostic and therapeutic implications. While only 10–15% of patients are diagnosed with early-stage disease [3, 6, 7], the therapeutic choices differ from those for advanced stage disease. The frequency of “early” or “limited” and “advanced” stage depends on the definitions and staging methods used. Investigators from Stanford University reported that staging based on physical exam and radiologic tests (before availability of PET/CT) classified approximately 30% of patients as stages I–II, but use of bone marrow biopsy and laparotomy/splenectomy decreased this proportion to 12% [21]. The introduction of FDG-PET/CT imaging can provide more accurate staging [22, 23]. If the disease can be encompassed in a single radiation field (i.e., stage I or contiguous stage II), radiation therapy can be offered with curative intent, with treatment given upon diagnosis, regardless of prognostic factors [24]. There are no prospective, randomized trials, but long-term follow-up studies suggest that disease-free survival (DFS) and overall survival (OS) after involved-field radiation therapy (IFRT) for early-stage FL are excellent, with 10-year DFS rates of 40–50% and 10-year OS of 60–65% [24,25,26,27]. More recently, the LymphoCare Study examined patterns of care in FL patients, observing that less than a third of early-stage patients were offered radiation therapy alone, in either community or academic medical centers [3]. Systemic treatments (rituximab alone, chemoimmunotherapy, and combined modality therapy) had excellent outcomes and are an appropriate therapeutic choice for early-stage FL patients [28].

As mentioned before, over 80% of FL patients present with advanced stage disease. The most widely adopted risk stratification in FL utilizes tumor burden to determine the need for immediate therapy or for continued observation (i.e., “watch and wait”). The Groupe d’Etude des Lymphomes Folliculaires (GELF) criteria were established after the GELF-86 trial [29]. This study was aimed at evaluating therapeutic strategies for FL patients with low tumor burden, defined as nodal or extranodal disease sites with diameter less than 7 cm, fewer than 3 nodal sites with a diameter of 3 cm or more, absence of systemic symptoms, no spleen enlargement, no lymphomatous serosal effusions, no local risk of organ compression, and no leukemic disease state or cytopenias. Among 541 enrolled FL patients, 195 (36%) had low tumor burden. Patients with low tumor burden had better 5-year overall survival (78%) compared with those with high tumor burden (57%). Deferred treatment of low tumor burden FL patients did not result in inferior overall survival, thus establishing this as the optimum therapeutic approach for this subset of patients. Although the criteria were designed to define low tumor burden, patients with any of the GELF high tumor burden features had lower overall survival [29], and these criteria are used in clinical practice as indicators for therapy initiation. More recent iterations include the presence of elevated lactate dehydrogenase or beta 2 microglobulin as markers of increased tumor burden [1]. The use of the GELF criteria as measures of disease burden remains valid in the era of rituximab [30] and newer chemoimmunotherapy regimens such as bendamustine and rituximab [31].
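The low tumor burden definition above is a conjunction of criteria, which can be sketched as a checklist. This is an illustrative sketch only: the class and field names are hypothetical, and the nodal-site criterion is read here as "fewer than three nodal sites measuring 3 cm or more", which is an interpretation of the text rather than a quotation of the trial protocol.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TumorBurdenFindings:
    """Hypothetical container for the findings named in the text."""
    largest_lesion_cm: float
    nodal_sites_3cm_or_more: int
    systemic_symptoms: bool
    spleen_enlarged: bool
    serosal_effusion: bool
    organ_compression_risk: bool
    leukemic_phase_or_cytopenia: bool


def high_burden_features(f: TumorBurdenFindings) -> List[str]:
    """Return the high tumor burden features present (empty list = low burden)."""
    checks = [
        (f.largest_lesion_cm >= 7, "lesion of 7 cm or more"),
        (f.nodal_sites_3cm_or_more >= 3, "three or more nodal sites of 3 cm or more"),
        (f.systemic_symptoms, "systemic symptoms"),
        (f.spleen_enlarged, "spleen enlargement"),
        (f.serosal_effusion, "serosal effusion"),
        (f.organ_compression_risk, "risk of organ compression"),
        (f.leukemic_phase_or_cytopenia, "leukemic phase or cytopenia"),
    ]
    return [label for present, label in checks if present]


def is_low_tumor_burden(f: TumorBurdenFindings) -> bool:
    """Low tumor burden means that none of the features above is present."""
    return not high_burden_features(f)
```

Returning the list of features, rather than a bare boolean, keeps the reason for a treatment indication visible to the reader of the output.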

Clinical Prognostic Markers

The use of clinical parameters for risk prediction and prognostic estimation in FL has been well established [1, 32, 33]. Well-validated clinical risk indices using baseline disease characteristics provide a general estimation of an individual patient’s risk but have not yet been applied as guides for initiation of therapy or for choice of specific therapeutic agents [2]. The increasing use of functional imaging with FDG PET/CT in FL has improved the accuracy of disease staging and tumor burden assessment at baseline [34] and has prompted studies evaluating the use of PET/CT-based measures for prognostic evaluation of FL.

Clinical Risk Indices Using Baseline Parameters

After the international prognostic index (IPI) was established for aggressive lymphomas, retrospective studies suggested that this index could be applied to indolent lymphomas [35, 36], but it was recognized that it was not an optimal discrimination tool [37]. A large multicenter analysis then followed, aimed at developing the follicular lymphoma international prognostic index (FLIPI). The final analysis included data from 4167 FL patients, with a median follow-up of 7.5 years [32]. The factors identified in the Cox regression analysis included age ≥ 60 years, Ann Arbor stage III or IV, hemoglobin level < 12 g/dl, lactate dehydrogenase above the upper limit of normal, and involvement of more than four nodal sites. The FLIPI score identified three risk groups: low risk with 0 or 1 factor (36%), intermediate risk with 2 factors (37%), and high risk with ≥ 3 risk factors (27%). Ten-year overall survival was 70.7% for low-risk patients, 50.9% for intermediate-risk patients, and 35.5% for high-risk patients (Table 1).
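The FLIPI is a simple count of adverse factors, so the scoring described above can be sketched in a few lines. The function and parameter names are assumptions made for this illustration; the thresholds and risk groups are those stated in the text.

```python
def flipi_score(age_years: float, ann_arbor_stage: int, hemoglobin_g_dl: float,
                ldh_above_uln: bool, nodal_sites_involved: int) -> int:
    """Count the five FLIPI adverse factors listed in the text (0 to 5)."""
    factors = [
        age_years >= 60,
        ann_arbor_stage >= 3,          # stage III or IV
        hemoglobin_g_dl < 12,
        bool(ldh_above_uln),           # LDH above the upper limit of normal
        nodal_sites_involved > 4,
    ]
    return sum(factors)


def flipi_risk_group(score: int) -> str:
    """Map a FLIPI score to the three risk groups described in the text."""
    if not 0 <= score <= 5:
        raise ValueError("FLIPI score must be between 0 and 5")
    if score <= 1:
        return "low"
    if score == 2:
        return "intermediate"
    return "high"
```

For example, a 64-year-old patient with stage III disease, hemoglobin of 13 g/dl, normal LDH, and three involved nodal sites has two factors and falls in the intermediate-risk group.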

Table 1 Clinical and clinicogenetic prognostic indices in follicular lymphoma

Of note, some parameters were not included in the analysis because they were not consistently collected, including erythrocyte sedimentation rate (more frequently collected in European patients), ECOG performance status (because of unexplained difference between European and US centers), and beta 2 microglobulin and serum albumin (high proportion of patients with missing data).

The FLIPI was developed from retrospective data using patients prior to the rituximab era. Subsequent prospective studies demonstrated that the FLIPI retains its predictive capacity in patients treated with chemoimmunotherapy, including those treated with rituximab and CHOP in a randomized clinical trial conducted by the German Low Grade Lymphoma Study Group [38], and from the National LymphoCare Project [39], which included more than 2200 patients treated with rituximab-containing regimens in their first line.

In contrast to the FLIPI, which was based on retrospective patient data with OS as the main endpoint, the FLIPI-2 score was generated from a prospective study with PFS as the main endpoint [33]. The study included 942 treated patients (559 [59%] treated with rituximab-containing regimens). Patients assigned to watchful waiting were excluded since PFS was not considered a relevant endpoint for this group. The pre-treatment variables comprising the FLIPI-2 are age older than 60 years, hemoglobin lower than 12 g/dl, beta 2 microglobulin higher than the upper limit of normal, largest lymph node diameter greater than 6 cm, and bone marrow involvement. Patients with no risk factors (low risk) had a 5-year PFS of 80%, those with one to two risk factors (intermediate risk) had a 5-year PFS of 51%, and those with three or more risk factors (high risk) had a 5-year PFS of 19%. In the same cohort, the FLIPI-2 was highly predictive of PFS, including in the subgroup of patients treated with rituximab (Table 1). The risk groups defined by the FLIPI-2 also differed in overall survival.
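The FLIPI-2 grouping can be sketched in the same way, here returning the 5-year PFS reported in the text alongside the risk group. The names below are hypothetical, and the PFS figures are simply the cohort-level values quoted above, not individual predictions.

```python
from typing import NamedTuple


class Flipi2Result(NamedTuple):
    score: int
    risk_group: str
    reported_5y_pfs_percent: int   # cohort-level value quoted in the text


def flipi2(age_years: float, hemoglobin_g_dl: float, b2m_above_uln: bool,
           largest_node_cm: float, bone_marrow_involved: bool) -> Flipi2Result:
    """Count the five FLIPI-2 factors and assign the risk group from the text."""
    score = sum([
        age_years > 60,
        hemoglobin_g_dl < 12,
        bool(b2m_above_uln),           # beta 2 microglobulin above upper limit of normal
        largest_node_cm > 6,
        bool(bone_marrow_involved),
    ])
    if score == 0:
        return Flipi2Result(score, "low", 80)
    if score <= 2:
        return Flipi2Result(score, "intermediate", 51)
    return Flipi2Result(score, "high", 19)
```

Note that the age criterion differs from the FLIPI in this sketch only because the text states it differently (older than 60 years here, 60 years or older for the FLIPI).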

Recently, investigators evaluated a simplified risk scoring system based on β2-microglobulin and the presence of bone marrow involvement [40•]. The training population consisted of 1135 patients treated with chemoimmunotherapy within the PRIMA trial (regimens included R-CHOP, R-CVP, or R-FCM). The primary endpoint was PFS. Three risk groups were identified: low-risk patients had neither β2-microglobulin above 3 mg/L nor bone marrow involvement, intermediate-risk patients had bone marrow involvement without elevated β2-microglobulin, and high-risk patients had β2-microglobulin above 3 mg/L. Progression-free survival was 69%, 55%, and 37% in these groups, respectively. This score, termed the PRIMA prognostic index (PRIMA-PI), was validated in a cohort from the University of Iowa/Mayo Clinic Lymphoma Specialized Program of Research Excellence Molecular Epidemiology Resource. The investigators also found a strong correlation between the PRIMA-PI and event-free survival at 24 months (Table 1).
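Because the PRIMA-PI uses only two variables, its three groups reduce to a two-step decision. The sketch below assumes the caller has already compared β2-microglobulin against the threshold of 3 given in the text, so the function takes a boolean and stays neutral on units; all names are hypothetical.

```python
# PFS percentages reported in the text for each risk group.
PRIMA_PI_REPORTED_PFS = {"low": 69, "intermediate": 55, "high": 37}


def prima_pi(b2m_above_threshold: bool, bone_marrow_involved: bool) -> str:
    """Assign the PRIMA-PI risk group from the two variables named in the text."""
    if b2m_above_threshold:
        # Elevated beta 2 microglobulin defines high risk regardless of marrow status.
        return "high"
    if bone_marrow_involved:
        return "intermediate"
    return "low"
```

The ordering of the checks matters: marrow involvement only separates low from intermediate risk once β2-microglobulin is known not to be elevated.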

The FLIPI and FLIPI-2 have distinct applications: the FLIPI assigns all patients at diagnosis to risk groups that are associated with overall survival, whereas the FLIPI-2 assigns patients requiring treatment upon diagnosis to risk groups that predict PFS. The FLIPI remains more widely used, in part because of its validation in prospective trials and observational settings, and also because beta 2 microglobulin is not measured at all centers [2].

Despite the prognostic value, most patients will not have specific therapeutic decisions made based on FLIPI, FLIPI-2, or PRIMA-PI scores. The absence of strategies that incorporate these indices in treatment decisions is probably the largest barrier to their wider adoption.

Response-Based Prognosis

Although not a “baseline” prognostic factor, response to initial therapy and particularly the development of disease-related events early after treatment have been identified as important prognostic factors. In the National LymphoCare Project, patients with disease progression within 24 months (POD24) after initial treatment with R-CHOP had lower 5-year OS (50% vs. 90% in patients without progression). The higher risk of mortality for patients with POD24 persisted after adjusting for FLIPI score (HR 6.4, 95% CI 4.3–9.6) [5••]. The prognostic relevance of POD24 has been validated in patient cohorts treated with other chemoimmunotherapy regimens [5••,41•], and more recently, authors from the Cancer and Leukemia Group B (CALGB) (now known as Alliance for Clinical Trials in Oncology) reported that POD24 was predictive of survival in patients treated in a series of phase II trials examining combinations of rituximab with targeted agents (galiximab, epratuzumab, and lenalidomide), with early progressors having a 5-year survival of 74% vs. 90% [42]. Investigators from the University of Iowa/Mayo Clinic Lymphoma SPORE Molecular Epidemiology Resource (MER) and Lyon, France, validated event-free survival (EFS) at 12 months and 24 months after diagnosis as predictors of survival in FL patients [43]. Patients failing to achieve EFS12 or EFS24 had significantly increased mortality compared with the age- and sex-matched general population, whereas patients reaching these milestones without events did not have excess mortality. The predictive role of EFS12 was present for patients treated with chemoimmunotherapy or rituximab and for those who underwent initial observation.

These response-based prognostic assessments have identified the subgroup of FL patients with highest need of more effective therapies. The limitation of these response-based values is that they represent, by their very nature, a posteriori measures of patient risk and cannot inform decisions of initial treatment. However, they are important treatment endpoints, and research now is focused on identifying prognostic factors that can predict early progression of disease; in addition, clinical trial design research is now evaluating whether sustained remissions 30 months after treatment can be an adequate surrogate for progression-free survival [44].

Baseline Functional Imaging Prognostic Prediction

FDG PET/CT is recommended for initial staging of FL. This imaging modality has higher sensitivity and preserved specificity for diagnosis of nodal and extranodal disease [45]. The total metabolic tumor volume (TMTV) is a quantitative measure of tumor burden that has prognostic value in Hodgkin [46] as well as several non-Hodgkin lymphoma subtypes [47, 48]. In a pooled analysis, Meignan and colleagues reviewed imaging and clinical data from three prospective FL trials that required FDG PET/CT at enrollment [49•]. Patients included in these trials met criteria for high tumor burden or advanced stage disease and received chemoimmunotherapy. TMTV was calculated by adding the metabolic volumes of all nodal and extranodal lesions meeting the threshold of 41% of the maximum standardized uptake value (SUVmax). The median TMTV was 297 cm3 (interquartile range 135–567 cm3), and 510 cm3 was found to be the optimal cutoff for prediction of OS and PFS. The calculation was highly reproducible between reviewers and within research groups. A TMTV above the threshold was associated with 5-year PFS and OS of 32.7% and 84.8%, respectively, compared with 65.1% and 94.7% for the group with TMTV below 510 cm3. An elevated TMTV correlated with advanced stage disease, higher nodal, extranodal, and bone marrow involvement as well as higher FLIPI and FLIPI-2 scores, and higher β2-microglobulin and lactate dehydrogenase. Additional studies have shown that high TMTV was correlated with elevated circulating tumor cells and cell-free DNA, both of which have been associated with worse outcomes [45]. Retrospective studies suggest that SUVmax predicts time to first treatment in patients with low tumor burden FL [50] but does not correlate with outcomes in FL patients treated with R-CHOP [51].
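The TMTV calculation described above (summing lesion volumes delineated at 41% of SUVmax) can be illustrated with a toy example. This is a simplified sketch, not the software used in the cited study: it assumes each lesion is supplied as a flat list of voxel SUV values, that the 41% threshold is applied to each lesion's own SUVmax, and that all voxels share one volume. All names are hypothetical.

```python
from typing import Sequence


def lesion_metabolic_volume_cm3(voxel_suvs: Sequence[float],
                                voxel_volume_cm3: float,
                                fraction_of_suvmax: float = 0.41) -> float:
    """Volume of the voxels at or above a fraction of this lesion's SUVmax."""
    if not voxel_suvs:
        return 0.0
    threshold = fraction_of_suvmax * max(voxel_suvs)
    n_voxels = sum(1 for suv in voxel_suvs if suv >= threshold)
    return n_voxels * voxel_volume_cm3


def total_metabolic_tumor_volume_cm3(lesions: Sequence[Sequence[float]],
                                     voxel_volume_cm3: float) -> float:
    """Sum the metabolic volumes of all nodal and extranodal lesions."""
    return sum(lesion_metabolic_volume_cm3(v, voxel_volume_cm3) for v in lesions)


def is_high_tmtv(tmtv_cm3: float, cutoff_cm3: float = 510.0) -> bool:
    """Compare against the 510 cm3 cutoff reported in the text."""
    return tmtv_cm3 > cutoff_cm3
```

In practice the segmentation is performed on the reconstructed image volume by dedicated software; the sketch only shows how a relative threshold turns voxel intensities into a summed volume.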

Biologic Risk Markers

Research over the last decades has led to a better understanding of the biology of follicular lymphoma, with increasing recognition of its diversity, manifested by multiple genetic abnormalities beyond t(14;18). These abnormalities alter epigenetic regulation and cell-surface receptor signaling pathways that affect cell survival and regulate interactions between the FL cell and its surrounding microenvironment, all of which are associated with overall prognosis [1, 2, 52].

Genetic and Epigenetic Deregulation

The presence of translocation t(14;18)(q32;q21), which juxtaposes BCL2 to the immunoglobulin heavy chain enhancer region, is an early event that occurs during VDJ recombination of progenitor B cells in the bone marrow [53]. This is present in approximately 90% of cases of FL and can be labeled as a founding event [54] but is not sufficient for development of the lymphoid neoplasm [55]. Of note, the presence of t(14;18) hampers proliferation, and additional mutations are necessary for neoplastic transformation [54, 56].

A second founding event in FL corresponds to epigenetic deregulation, with the vast majority of cases (90–95%) [1, 54, 57,58,59,60] having at least one mutation involving an epigenetic regulator gene. Mutations in the H3K4 histone methyltransferase KMT2D (also known as MLL2) are second in frequency only to those involving BCL2 and occur in 75–80% of FL cases [1, 54]. This mutation not only confers a proliferative advantage [61] but also contributes to further genomic instability [62]. Additional mutations of histone modifiers include gain-of-function mutations of EZH2 (in approximately 25% of FL cases), which increase methylation of histone H3K27. CREBBP (mutated in 30–60% of FL cases) and EP300 (9% of FL cases) encode histone acetyltransferases that mediate the acetylation of H3K27 [57]; their mutations affect proliferation and immune evasion of FL cells [54], and CREBBP mutations facilitate proliferation and increase the germinal center reaction [54, 63]. Other recurrently mutated epigenetic modifiers in FL include MEF2B, HIST1H1, ARID1A, and SMARCA4.

Cell Signaling Deregulation

Several signaling cascades play essential roles in B cell maturation and survival, including the B cell receptor (BCR) pathway, CD40, toll-like receptor (TLR), NOTCH, BAFF, EphA2, and others [54]. The combined abnormalities result in persistent proliferation and survival of FL cells. Evidence suggests that BCR signaling is of highest relevance for most FL cases, highlighted by the persistence of surface immunoglobulin expression despite inactivation of one allele by t(14;18). Post-translational modification of immunoglobulin is common, and FL cases present higher rates of N-glycosylation and express specific IgVH genes with oligomannose motifs [54, 64], which affect the interaction of the FL cell with its microenvironmental partners, such as follicular dendritic cells (via mannose receptors and dendritic cell-specific ICAM-grabbing nonintegrin (DC-SIGN) receptors), resulting in non-internalization of the immunoglobulin and persistent, tonic BCR signaling [65, 66]. Intermediate signaling mediators of the BCR and phosphatidylinositol-3-kinase (PI3K) pathways can be affected by additional alterations of downstream mediators, such as amplification or mutation of CARD11 and PTEN loss [59]. Additional abnormalities include homozygous deletion of TNFAIP3 affecting CD40 and TLR signaling [57] or loss of EphA7 forcing signaling through EphA2 [54].

Microenvironmental Changes

The microenvironment surrounding FL cells includes T cells (including T follicular helper [Tfh] cells and T regulatory cells [Treg]), follicular dendritic cells, monocytes, macrophages, and reticular cells [52, 54]. These cells not only provide a supportive stroma but also act as signaling partners and increase local chemokine content, facilitating FL proliferation, survival, and immune escape. T follicular helper cells express CD40L and IL-4, which in turn increase CCL17 and CCL22 production by FL cells, facilitating migration of Treg cells to the tumor [67]. Presence of FOXP3+ Treg lymphocytes has been associated with worse outcomes [68, 69]. An increase in Treg content is part of a tumor-permissive microenvironment, which also includes an increase in macrophage content. Macrophages can directly interact with FL cells via DC-SIGN and highly mannosylated immunoglobulin moieties [66]. Gene expression studies in FL have found that a non-malignant stromal cell gene expression signature skewed towards a specific T cell gene expression pattern was associated with improved survival (relative risk = 0.15; 95% CI 0.05–0.46), whereas a signature skewed towards genes preferentially expressed in macrophages or follicular dendritic cells was associated with worse survival (relative risk = 9.35; 95% CI 3.02–28.9) [70]. An elevated tumor-associated macrophage content has been associated with worse prognosis in studies using immunohistochemical (IHC) techniques [71] as well as studies combining RT-PCR and IHC [72].

Clinico-Genetic Predictive Risk Indices

In an effort to integrate genetic studies into a validated prognostic score, an international collaboration performed deep sequencing of 74 genes with known recurrent mutations in FL [73••]. The training cohort comprised 151 FL specimens from patients enrolled in the GLSG2000 clinical trial and treated with R-CHOP. Patients had advanced stage disease and had indications for therapy (B symptoms, bulky disease, impaired hematopoiesis, or rapid progression). Non-silent mutations were found in 146 (97%) patients, with a median of four mutations per patient. The most commonly affected genes were KMT2D, CREBBP, BCL2, TNFRSF14, and EZH2. Individual gene mutations had association with FLIPI on univariate testing, but not on multivariate analysis. Multivariable risk models predicting failure-free survival were tested, finding superiority of a model including seven genes as well as FLIPI (as binary high vs. low/intermediate) and ECOG performance status. This model, termed the m7-FLIPI, included seven genes (EZH2, ARID1A, EP300, FOXO1, MEF2B, CREBBP, and CARD11) along with FLIPI and ECOG values. The score is calculated as the sum of the predictor values weighted by Lasso coefficients. An online m7-FLIPI calculator is available at http://www.glsg.de/m7-flipi/. The optimum cutoff of 0.8 was used to identify high-risk patients, who had significantly worse 5-year failure-free survival than low-risk patients (38.3% vs. 77.2%, p < 0.0001); the score was also predictive of overall survival (Table 1). The m7-FLIPI re-classified subjects previously assigned as high risk by FLIPI to a low-risk category, primarily through identification of mutations in EZH2. This has proven to be the first well-validated clinicogenetic risk model in FL.
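The structure of such a score, a weighted sum of binary predictors compared against a cutoff, can be sketched as follows. This is not the m7-FLIPI itself: the published Lasso coefficients are not given in the text, so the weights below are arbitrary placeholder values chosen only to make the example run. The predictor key names, the binary encoding of ECOG status, and the choice of "greater than or equal to" at the 0.8 cutoff are also assumptions.

```python
from typing import Mapping

# Predictors named in the text; key names are hypothetical labels.
M7_PREDICTORS = ("EZH2", "ARID1A", "MEF2B", "EP300", "FOXO1",
                 "CREBBP", "CARD11", "FLIPI_high", "ECOG_poor")

# PLACEHOLDER weights for demonstration only. These are NOT the published
# Lasso coefficients and must not be used for risk estimation.
DEMO_WEIGHTS = {name: 0.25 for name in M7_PREDICTORS}


def weighted_risk_score(predictors: Mapping[str, int],
                        weights: Mapping[str, float]) -> float:
    """Sum each binary predictor (0 or 1) multiplied by its weight."""
    missing = set(weights) - set(predictors)
    if missing:
        raise KeyError(f"missing predictors: {sorted(missing)}")
    return sum(weights[name] * predictors[name] for name in weights)


def is_high_risk(score: float, cutoff: float = 0.8) -> bool:
    """Apply the 0.8 cutoff from the text (inclusive comparison is an assumption)."""
    return score >= cutoff
```

With real coefficients, some weights would be negative, which is how a mutation such as one in EZH2 could move a patient with high-risk FLIPI into the low-risk category, as described in the text.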

Conclusions and Future Directions

The last 3 decades of FL research have demonstrated that it is a highly variable disease entity, with significant biological diversity, resulting in very disparate outcomes. While basic research has identified recurrent genetic, microenvironmental, and signaling abnormalities in FL, clinical research has identified risk factors associated with outcomes, including progression-free survival and overall survival. Although validated in large cohorts, prognostic indices such as FLIPI and FLIPI-2 can be used to define prognosis but are not useful in making therapeutic decisions. The GELF criteria remain the sole parameter for decision-making in the front-line management of FL.

While recent clinical studies have identified response to first-line therapy and early progression as important prognostic predictors, there are few tools that can identify patients at highest risk of failing first-line therapy [74]. The recently introduced clinicogenetic prognostic index m7-FLIPI combines information from biologic and clinical research and provides more accurate prediction than previous clinical indices. A recent article by Jurinovic and colleagues, using the m7-FLIPI as a predictor of disease progression, was able to identify the subgroup of FL patients at highest risk and with the largest unmet need [41•]. Additional research is needed to establish the therapeutic strategies that can overcome this elevated risk at baseline.

The identification of recurrent biologic abnormalities has guided drug development in FL, with several new signaling inhibitors being studied in clinical trials and receiving regulatory approvals. With the abundance of these inhibitors and immunotherapies entering the armamentarium, the development of validated specific biologic markers and predictive tests (as well as designing the trials to validate their use) to select the most effective targeted agents for treatment of FL will be a major objective for lymphoma research over the next decade.