Introduction

Myelodysplastic syndromes (MDS) are a group of clonal hematopoietic disorders characterized by inefficient differentiation, abnormal blood counts, and a tendency to develop acute myeloid leukemia (AML) [1, 2]. Clinical manifestations of MDS are heterogeneous and range from indolent disease with mild cytopenias and a life expectancy measured in years to more aggressive disease with profound cytopenias, frequent AML progression, and survival measured in months. Accurately predicting the outcome of patients with MDS is clinically essential. Treatment recommendations and consensus guidelines are based on risk stratification of patients into lower- and higher-risk categories [3,4,5,6]. The goals of therapy for lower-risk disease are to improve quality of life and decrease transfusion dependency, whereas prolonging overall survival (OS) and delaying AML progression are more pressing needs for patients with higher-risk disease [3,4,5,6]. Accurately predicting disease outcomes is also important from the patient perspective, as it sets expectations regarding disease severity and its likely impact.

Several prognostic models have been developed in the last two decades to aid physicians in risk stratifying MDS patients [7,8,9,10,11] (reviewed elsewhere [12, 13]). These models rely on clinical variables derived from a bone marrow biopsy evaluation and peripheral blood counts. Some models include patient characteristics such as age, co-morbidity, and performance status. More recently, several recurrent somatic mutations have been identified in MDS and myeloid malignancies with an impact on OS that is independent of clinical measures [14,15,16,17]. Attempts to build molecular prognostic models or to combine molecular data with existing prognostic models have been made, although the generation of a widely accepted molecularly integrated scoring system for MDS remains a work in progress. In this review, we will discuss the most commonly used prognostic models in MDS and how the addition of molecular data may improve their predictive power in clinical practice.

Prognostic Models in MDS

All commonly used prognostic scoring systems in MDS consider clinical variables that are either patient-related or disease-related (Fig. 1). Patient-related factors can include age, performance status, and co-morbidities. Disease-related factors can be divided into the following: (1) pathological features of the disease such as WHO classification, bone marrow blast percentage, cytogenetic analysis, and flow cytometric measures; (2) laboratory measures such as hemoglobin, absolute neutrophil count, platelet count, ferritin, LDH, peripheral blast percentage, and albumin level; and (3) biological factors that include molecular data obtained from DNA sequencing, RNA sequencing, methylation profiling, and microRNA profiles (Fig. 1). These features represent the pathogenic mechanisms responsible for disease phenotypes and are therefore strongly associated with the overall prognosis. For example, a patient with multilineage dysplasia (MDS-MLD) can have a lower-risk disease when it is associated with good-risk cytogenetics, low blast percentage, and mild cytopenias, but would have higher-risk disease if they had a poor risk karyotype and severe pancytopenia. The impact of each prognostic factor is often additive and can contribute to the heterogeneity of disease presentation.

Fig. 1

Prognostic factors in MDS. The figure shows how the prognostic factors can be divided into disease-related and patient-related factors. Abbreviations: FC = flow cytometry, PS = performance status, MDS = myelodysplastic syndromes, MDS-SLD = MDS with single lineage dysplasia, MDS-MLD = MDS with multilineage dysplasia, RS = ring sideroblasts, EB = excess blasts, U = unclassifiable

The most commonly used models in clinical practice and for clinical trial eligibility consider primarily factors derived from pathological studies and laboratory values. These models include the International Prognostic Scoring System (IPSS) [8], the revised-IPSS (IPSS-R) [9], the World Health Organization (WHO) classification-based Prognostic Scoring System [18], MD Anderson Lower Risk Prognostic Scoring System (LRPSS) [11], and the MD Anderson Global Prognostic Scoring System (MDAPSS) [10].

International Prognostic Scoring System

The IPSS was developed in 1997 based on 816 patients with de novo MDS who received only supportive care [8]. The model considers three measures: conventional cytogenetics, bone marrow blast percentage, and the number of cytopenias present [8]. The IPSS remains one of the most widely used models in clinical practice given its ease of application and history as a risk stratification tool for clinical trial eligibility. Nevertheless, the IPSS has several limitations. It is not valid for patients with secondary/therapy-related MDS or proliferative chronic myelomonocytic leukemia (CMML), as these patients were excluded from its training cohort [19,20,21]. The IPSS may not be a dynamic tool either, as it was developed in patients at diagnosis who did not receive disease-modifying therapies such as lenalidomide, hypomethylating agents, or stem cell transplantation. Therefore, its applicability later in the disease course, particularly at the time of hypomethylating agent failure, may be limited [22]. Finally, the IPSS does not account for the severity of cytopenias and thus may underestimate risk in some patients with otherwise lower-risk features.

World Health Organization Classification-Based Prognostic Scoring System

The WHO classification-based Prognostic Scoring System (WPSS) uses pathological, clinical, and patient-related factors that include WHO subgroups, conventional cytogenetics, and the degree of anemia [18]. This model has been validated at times other than diagnosis, making it a dynamic risk assessment tool, although its performance after hypomethylating agent (HMA) treatment may be limited [22]. As with the IPSS, patients with secondary/therapy-related MDS were excluded from the analysis, which limits the applicability of this model in that population.

MD Anderson Global Prognostic Scoring System and Lower Risk Prognostic Scoring System

The MDAPSS was the first model to include treated and untreated MDS patients as well as patients with proliferative CMML and secondary/therapy-related MDS [10]. The model considers pathological, clinical, and patient-related factors that include the following: bone marrow blast percentage, chromosome 7 abnormalities or complex karyotypes, platelet count, hemoglobin level, white blood cell count, performance status, age, and history of prior transfusions [10]. Despite excellent performance and broad inclusion criteria, the relative complexity of this model has limited its use in clinical practice and for trial eligibility.

The MD Anderson Lower-Risk Prognostic Scoring System (LRPSS) was developed to address the limitations of the IPSS and to stratify patients with IPSS lower-risk disease more accurately [11]. The model uses variables such as age, bone marrow blast percentage, and cytogenetics, and it also accounts for the severity of anemia and thrombocytopenia [11]. Approximately 25–30% of patients with IPSS lower-risk disease will be upstaged into a higher-risk category by the LRPSS. These patients have a predicted OS similar to that of patients with higher-risk disease by the IPSS. This becomes clinically important because the choice of therapy is highly dependent on prognosis, and identifying the actual risk in these patients could alter their treatment recommendations [23, 24].

The Revised International Prognostic Scoring System

In 2013, the International Working Group revised the IPSS to address some of its described limitations. The IPSS-R was developed in more than 7000 untreated patients with de novo MDS. Although it uses prognostic factors similar to those in the IPSS, it considers more comprehensive cytogenetic risk categories, different blast percentage cutoffs, and, most importantly, the severity of each cytopenia [22]. While the IPSS-R was generated from a cohort of untreated patients, it has been validated in patients after first-line therapy with an HMA, lenalidomide, or allogeneic stem cell transplantation. However, it is not as accurate in therapy-related MDS, and its application at the time of HMA failure is limited [22, 25,26,27,28]. Finally, age is not formally included in the IPSS-R. Age has a significant impact on overall survival and can be taken into account by modifying the IPSS-R risk score using this formula: (years − 70) × [0.05 − (IPSS-R risk score × 0.005)] [22].
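The age adjustment quoted above can be sketched in a few lines of Python. This is a minimal illustration only: the function name is hypothetical, and it assumes the age term is added to the raw IPSS-R risk score, a convention the text does not spell out.

```python
def age_adjusted_ipssr(ipssr_score: float, age_years: float) -> float:
    """Shift an IPSS-R risk score by the age term quoted in the text.

    Assumption (not stated in the source): the age term is *added*
    to the raw IPSS-R risk score.
    """
    # (years - 70) x [0.05 - (IPSS-R risk score x 0.005)]
    adjustment = (age_years - 70) * (0.05 - ipssr_score * 0.005)
    return ipssr_score + adjustment


# At age 70 the age term is zero, so the score is unchanged.
print(age_adjusted_ipssr(3.0, 70))
# An older patient with the same raw score receives a higher adjusted score.
print(age_adjusted_ipssr(3.0, 80))
```

Because the bracketed factor shrinks as the IPSS-R risk score rises, the same age difference shifts the score more in lower-risk than in higher-risk disease.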

Prognostic Impact of Mutations on Overall Survival in Patients with MDS

Targeted and larger-scale next-generation genomic sequencing techniques have largely defined the genomic landscape of MDS [14,15,16,17, 29, 30]. Recurrent mutations affect several biological pathways including RNA splicing, DNA methylation, and chromatin modification, among others. Mutations have been shown to impact the pathophysiology of MDS. Many are associated with disease phenotypes, and several have an impact on OS and AML transformation risk (5–8) (Table 1). One or more typical mutations can be found in nearly every MDS patient if a genomic panel of 40 genes or more is examined. Associations between mutations and clinical/pathological variables have been described. For example, TET2 mutations occur more frequently in patients with a normal karyotype, and their co-occurrence with SRSF2 or ZRSR2 mutations is highly specific for CMML [16, 17]. Mutations of SF3B1 are very common in patients with ring sideroblasts and are the only mutations considered prognostically favorable [16, 17, 31, 32]. Several mutations have a significant impact on OS independent of clinical variables (Table 1). In a study of 944 MDS patients, genome sequencing of 104 genes showed that 25/48 mutations were negatively associated with OS including: PTPN11, NPM1, TP53, PRPF8, EZH2, LUC7L2, NRAS, KRAS, FLT3, RUNX1, NF1, LAMB4, GATA2, ASXL1, SMC1A, and STAG2, with only SF3B1 having a positive impact on OS [17]. However, after adjusting for known clinical risk factors, only five mutations (ASXL1, KRAS, PRPF8, SF3B1, and RUNX1) remained significant, suggesting substantial overlap between the clinical and mutational data [17]. Not all mutated genes carry similar prognostic significance, and their impact on OS can change depending on the clinical context in which they are identified. In a large meta-analysis of 3562 MDS samples collected from 19 institutions across the globe, mutations in several genes were associated with significant differences in OS after adjustment for IPSS-R risk groups.
Interestingly, the independent impact of many of these mutated genes was found only in certain contexts [14]. For example, SF3B1 mutations were strongly associated with a favorable impact on OS in patients with less than 5% bone marrow blasts even after adjustment for IPSS-R risk groups. However, this association was lost in patients with higher blast percentages [14]. Similarly, mutations in ASXL1, U2AF1, and SRSF2 had a negative impact on OS in patients with blast percentages < 5% but lost their independent significance in patients with higher blast percentages. Overall, mutations in 12 genes were independently associated with OS including: TP53, RUNX1, EZH2, NRAS, SF3B1, CBL, ASXL1, TET2, IDH2, KRAS, and NPM1. In a multivariable analysis that included all mutated genes, mutations of TP53, RUNX1, EZH2, NRAS, and SF3B1 remained independently significant after adjustment for IPSS-R risk categories [14]. While useful to identify the prognostic value of individual gene mutations, this approach does not account for the impact of multiple mutations or the potential interactions between co-existing mutations on OS.

Table 1 Commonly mutated genes in MDS and their prognostic impact in different bone marrow blast contexts

Moreover, the impact of a given mutation on overall outcome is not binary, as it can differ based on mutation characteristics such as variant allele frequency (VAF), mutation location, the type of mutation (missense vs. others), and the presence of co-mutated genes. For example, patients with TP53 mutations have a poor OS in general, although patients with a VAF < 25% had significantly better OS compared to patients with VAF > 50% with median OS of 12.4 months compared to 3.4 months, respectively [33].

Somatic mutations can also be used to predict outcomes after allogeneic stem cell transplantation [34,35,36,37]. Several studies have shown that TP53 mutations are associated with dismal overall survival after transplantation, with death almost always caused by relapsed or refractory disease. In a large cohort of 1514 patients with MDS who were enrolled in the Center for International Blood and Marrow Transplant Research Repository between 2005 and 2014, TP53 mutations were associated with shorter overall survival and shorter time to relapse even after adjustment for known clinical risk factors [37]. The adverse effect of TP53 mutations was independent of the conditioning regimen intensity and patient age. Interestingly, in patients ≥ 40 years old with wild-type TP53, the presence of RAS pathway mutations was associated with higher risk of relapse and inferior outcome, and the presence of JAK2 mutations was associated with higher risk of death without relapse and shorter OS [37]. In a similar study of 797 patients with MDS who received allogeneic stem cell transplant via the Japan Marrow Donor Program, complex karyotype or mutations in TP53 or RAS-pathway genes were also identified as independent prognostic factors associated with inferior outcome post-transplantation [36]. Survival of patients with both TP53 mutations and a complex karyotype was particularly poor and characterized by frequent early relapses. However, the outcome was slightly better for patients with TP53 mutations without complex karyotype. In this study, the negative impact of RAS-pathway mutations was mainly observed in patients with myelodysplastic/myeloproliferative neoplasms, disorders in which such mutations are more common. Nevertheless, long-term survival could be obtained in some patients with TP53 mutations, suggesting that these lesions should not constitute an absolute contraindication to transplant [36].
Given the lack of alternative therapeutic options for these patients, approaches that might minimize the risk of relapse should be considered [33].

Incorporation of Molecular Data into Current Prognostic Models

Given the independent impact of several somatic mutations on OS, several efforts have been made to build prognostic models that incorporate molecular data. Haferlach et al. used Cox regression analysis to evaluate the impact of multiple gene mutations/deletions alone or in combination with common clinical variables. The authors divided the cohort of 786 patients into training (611) and validation (175) cohorts to generate their models [17]. A total of 14 genes along with age, gender, and clinical variables derived from the IPSS-R were used to construct a model in which patients could be classified into four risk groups with predicted 3-year survival of 95.2, 69.3, 32.8, and 5.3% for the low, intermediate, high, and very high risk groups, respectively (p < 0.001) [17]. When the survival analysis was limited to genetic mutations only, 13 of the 14 genes included in the geno-clinical model were selected to build a genomic-only model. Interestingly, the geno-clinical model outperformed the IPSS-R in the training and validation cohorts, whereas the genomic-only model performed similarly to the IPSS-R, suggesting that models combining clinical and molecular data can outperform those that include only one type of information [17].

An alternative approach uses molecular data to refine predictions made with current prognostic models and thereby improve their predictive power. For example, the presence of mutations in TP53, EZH2, ETV6, RUNX1, or ASXL1 can effectively upstage patients with IPSS low, intermediate-1, or intermediate-2 risk by one risk category [38].
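The refinement rule described above can be illustrated with a short Python sketch. The function and constant names are hypothetical, and the logic is an assumption drawn only from the sentence above: any one of the five listed mutations moves a patient up exactly one IPSS category, and patients already in the highest category stay where they are.

```python
# The four IPSS risk categories, ordered from lowest to highest risk.
IPSS_CATEGORIES = ["Low", "Intermediate-1", "Intermediate-2", "High"]

# Genes whose mutations upstage patients, as listed in the text.
ADVERSE_GENES = {"TP53", "EZH2", "ETV6", "RUNX1", "ASXL1"}


def refine_ipss_category(category: str, mutated_genes: set) -> str:
    """Upstage an IPSS category by one step if any adverse gene is mutated.

    Illustrative sketch only; not a validated clinical tool.
    """
    if category not in IPSS_CATEGORIES:
        raise ValueError(f"Unknown IPSS category: {category!r}")
    index = IPSS_CATEGORIES.index(category)
    has_adverse_mutation = bool(ADVERSE_GENES & set(mutated_genes))
    # Patients already in the highest category cannot be upstaged further.
    if has_adverse_mutation and index < len(IPSS_CATEGORIES) - 1:
        return IPSS_CATEGORIES[index + 1]
    return category


print(refine_ipss_category("Low", {"ASXL1", "TET2"}))
print(refine_ipss_category("Intermediate-1", {"SF3B1"}))
```

In this sketch the first call moves the patient from Low to Intermediate-1 because ASXL1 is mutated, while the second leaves the category unchanged because SF3B1 is not among the five listed genes.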

In an effort to add molecular data to the IPSS-R scoring system, the clinical and mutational data of 508 MDS patients treated at the Cleveland Clinic between 2000 and 2012 were analyzed. Using a 62-gene sequencing panel, mutations in ASXL1, RUNX1, TP53, EZH2, SRSF2, and NPM1 were significantly associated with a negative impact on OS, whereas mutations in SF3B1 were associated with a positive impact. In multivariate analyses, only age, IPSS-R score, EZH2, SF3B1, and TP53 remained significant (p < 0.05). Based on the beta-coefficients of these prognostic factors, a linear risk score was developed: age × 0.04 + IPSS-R score × 0.3 + EZH2 × 0.7 + SF3B1 × 0.5 + TP53 × 1. This new model separated patients into four risk groups with median OS as shown in Table 2. The C-index of the new model was higher than that of the IPSS-R alone, indicating improved prognostic accuracy. More importantly, the addition of molecular data to the IPSS-R created a dynamic model with improved predictive power when applied to paired samples obtained at different time points during the disease course.
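The linear risk score above is simple arithmetic and can be sketched in Python as follows. The function name and argument names are hypothetical; the coefficients are copied exactly as printed in the formula above, including the sign of the SF3B1 term, and gene arguments are coded 1 if the mutation is present and 0 if absent. The risk-group cutoffs are not reproduced here, so the sketch stops at the raw score.

```python
def combined_risk_score(age_years: float, ipssr_score: float,
                        ezh2: int, sf3b1: int, tp53: int) -> float:
    """Compute the linear risk score exactly as written in the text.

    Gene arguments are 1 if the mutation is present, 0 if absent.
    Illustrative sketch only; not a validated clinical tool.
    """
    for flag in (ezh2, sf3b1, tp53):
        if flag not in (0, 1):
            raise ValueError("Gene flags must be 0 (absent) or 1 (present)")
    return (age_years * 0.04
            + ipssr_score * 0.3
            + ezh2 * 0.7
            + sf3b1 * 0.5
            + tp53 * 1.0)


# A 70-year-old patient, IPSS-R score 3.0, TP53 mutated, EZH2 and SF3B1 wild type.
score = combined_risk_score(70, 3.0, ezh2=0, sf3b1=0, tp53=1)
print(round(score, 2))
```

For this hypothetical patient the terms are 70 × 0.04 = 2.8, 3.0 × 0.3 = 0.9, and 1 for TP53, giving a score of about 4.7.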

Table 2 Cutoff and median overall survival estimates for risk groups defined by combining IPSS-R score and mutations according to the combined Cleveland clinic model formula (gene name = 1 if mutation is present, 0 if mutation is absent)

To further investigate whether the addition of molecular data to all established prognostic models can improve their predictive power without altering the original model scoring systems, a cohort of 610 MDS patients who were treated at Cleveland Clinic and had clinical and mutational data was analyzed [39]. In univariate analyses, mutations in EZH2, TP53, RUNX1, NPM1, and SF3B1 had a significant impact on OS [39]. However, after adjusting for model score and age, only three mutations (EZH2, SF3B1, and TP53) remained prognostically significant. Adding these three mutations along with age to all models improved their predictive power and, more importantly, upstaged or downstaged patients into more appropriate risk categories. The addition of molecular data to the IPSS upstaged 37% of patients to a higher-risk category and downstaged 5% of intermediate-1 to low-risk disease. When molecular data were combined with the WPSS, 21% of patients were upstaged and 24% downstaged. For the MDAPSS, 19% were upstaged and 22% downstaged from intermediate-1 to low risk. Finally, when molecular data were combined with the IPSS-R, 26% of patients were upstaged to higher-risk disease, including 62% of patients with intermediate risk who moved to a higher-risk category. This approach shows how the addition of gene mutations to current prognostic models can improve their predictive power without necessarily having to alter the original model [39].

Conclusions

Risk stratification systems are among the most important tools used in the clinical care of patients with MDS. Current models are highly predictive of outcomes in studies of large patient cohorts but may be associated with more uncertainty when applied to individual patients in practice. Incorporating molecular data into current models can improve their predictive power and reassign a substantial fraction of patients to more appropriate risk categories. However, there is no single consensus method for integrating clinical risk factors and genetic mutations, and the optimal method for incorporating mutational data into clinical prognostic models is still being developed. Given the interaction between mutations and disease features, an integrated model may look very different from those we use today. Novel approaches could include reclassifying MDS into distinct clinical and molecular contexts, each with its own relevant prognostic features. The International Working Group for MDS is developing such a model which should help standardize our approach to clinicogenetic risk stratification for MDS. In the meantime, somatic mutations can help us qualitatively adjust how we predict the prognosis and set expectations for our patients with MDS.