Introduction

Multiple myeloma (MM) is a clinically and genomically heterogeneous disease. Compared with conventional chemotherapy, high-dose melphalan with autologous stem cell rescue (ASCT) led to substantially higher rates of complete response (CR) (up to 50 %) and was associated with improved progression-free survival (PFS) and overall survival (OS) [1–4]. More recently, regimens incorporating novel immunomodulatory agents [5–7] and proteasome inhibitors have been associated with improved depth of response, PFS, and OS. Five-year survival now approaches 50 and 70 % in transplant-ineligible and transplant-eligible patients, respectively [5, 8, 9]. Monoclonal antibodies and epigenetic therapy may further improve outcomes.

The techniques for assessing disease burden and minimal residual disease (MRD) in MM have transitioned from protein electrophoresis and immunofixation (PEP/IFE) to multiparameter flow cytometry (MFC), polymerase chain reaction (PCR), fluorine-18 fluorodeoxyglucose positron emission tomography CT (PET-CT), and most recently, next-generation sequencing (NGS). Over this transition, sensitivity has increased from the detection of paraprotein in the range of 1–2 g/dL using serum/urine PEP to the detection of one myeloma cell in 10⁵–10⁶ cells (10⁻⁵–10⁻⁶) using real-time quantitative PCR and NGS [10–13]. Direct comparison of paraprotein sensitivity to cell-based methods is difficult but can be made by correlating myeloma cell mass assessed by paraprotein secretion rate to cell numbers by MFC or PCR dilution studies [14, 15]. If the methods for assessing MRD become accessible, reliable, reproducible, and cost-effective, the definitions of depth of response and the significance of an MRD-negative CR will need to be revisited. More importantly, if an MRD-negative state can be achieved with new combinations of novel agents and shown to predict outcomes such as PFS and OS, this, in turn, will allow for (1) an improved risk-stratified approach to therapy and (2) earlier identification of response to novel agents, particularly in the setting of clinical trials. This review addresses the use of highly sensitive techniques for assessing MRD and the clinical relevance of achieving MRD negativity in the post-induction, post-ASCT or allogeneic stem cell transplant (allo-SCT), and relapsed/refractory disease states.

Serum and Urine PEP and IFE

Serum and urine PEP and immunofixation (IFE) are widely available methods used to measure disease burden in MM. Using these techniques, the definition of CR has evolved over time. With high-dose therapy, the definition of CR evolved from a >75 % reduction in paraprotein to include not only the disappearance of bone marrow (BM) clonal plasma cells (PC) but also the absence of urine and serum paraprotein by IFE. The clinical significance of achieving a CR by PEP/IFE has been well described [1, 3, 16–22], perhaps most convincingly in two large meta-analyses, one in the transplant-eligible and one in the transplant-ineligible population, that demonstrated a correlation between the depth of response by PEP/IFE and long-term survival outcomes [23, 24]. It is unclear, however, whether treatment response is truly an independent variable or a surrogate for adverse molecular risk.

To answer this question, Haessler et al. conducted a multivariate regression analysis of 668 newly diagnosed MM patients who were uniformly treated with a tandem ASCT according to the TT2 protocol [25]. When only standard prognostic features were examined, attainment of a CR benefited patients regardless of risk factors defined by gene-expression profiling (GEP). However, using GEP risk stratification in a subset of 326 patients, a survival benefit of CR pertained only to the small high-risk subgroup of 13 % of patients (hazard ratio (HR) 0.23; p = 0.001), whereas the remaining majority of patients with low-risk disease had similar survival outcomes whether or not CR was achieved (HR 0.68; p = 0.128). Therefore, the prognostic value of a CR may not be a truly independent variable, but rather dependent on other risk stratification criteria.

Another limitation of correlating depth of response to outcomes in chronic diseases like MM is that effective treatment can arrest the progression of disease even in the absence of complete disease eradication. For this reason, landmark analysis, wherein all patients are followed based on intent-to-treat, is especially powerful [26]. The Southwest Oncology Group (SWOG) applied a landmark analysis to the 6- and 12-month outcomes of 1555 previously untreated MM patients enrolled onto four phase II trials based on response category. One third of the patients received novel agents (thalidomide, bortezomib, or lenalidomide) for induction, and the remainder received conventional alkylator-based chemotherapy [27]. At the 6- and 12-month landmarks, patients with progressive disease (PD) had an inferior outcome, as expected. However, in patients without PD, the median OS in responders was comparable to the nonresponders (i.e., those with stable disease) at 34 months [27]. Thus, the magnitude of response by PEP and IFE, as a single variable, did not predict survival duration.
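The grouping step of a landmark analysis can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the SWOG method: the `Patient` fields, the `landmark_groups` helper, and the toy cohort are all hypothetical, and censoring is ignored (a real analysis would compare the groups with Kaplan-Meier estimates). The key idea shown is that each patient is classified by response status at the landmark, patients no longer in follow-up at the landmark are excluded, and survival is then counted from the landmark onward.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Patient:
    # Hypothetical illustration fields; all times are months from enrollment.
    followup_months: float               # time to death or last contact
    response_month: Optional[float]      # first documented response, None if never
    progression_month: Optional[float]   # first documented PD, None if never


def landmark_groups(patients: List[Patient], landmark: float) -> Dict[str, List[float]]:
    """Group patients by status at the landmark; values are months survived after it."""
    groups: Dict[str, List[float]] = {"responder": [], "stable": [], "progressive": []}
    for p in patients:
        if p.followup_months < landmark:
            continue  # no longer in follow-up at the landmark, so excluded
        if p.progression_month is not None and p.progression_month <= landmark:
            status = "progressive"
        elif p.response_month is not None and p.response_month <= landmark:
            status = "responder"
        else:
            # Includes patients who respond only after the landmark.
            status = "stable"
        groups[status].append(p.followup_months - landmark)
    return groups


# Toy cohort (invented values, for illustration only).
cohort = [
    Patient(40, 3, None),     # responded before the 6-month landmark
    Patient(40, None, None),  # never responded, never progressed
    Patient(12, None, 4),     # progressed before the landmark
    Patient(5, 2, None),      # follow-up ended before the landmark
    Patient(30, 9, None),     # responded only after the landmark
]
groups = landmark_groups(cohort, landmark=6)
print(groups)
```

Classifying the late responder as stable at the landmark is what prevents patients from being credited with a response simply because they survived long enough to achieve one.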

Serum Free Light Chain (Freelite, The Binding Site Ltd., Birmingham, UK) and Heavy/Light Chain Analysis (Hevylite™)

Normalization of free light chains (FLC) has recently been incorporated into the International Myeloma Working Group (IMWG) definition of stringent CR (sCR), in addition to <5 % plasma cells in the BM with negative serum and urine IFE. The serum immunoglobulin FLC assay measures levels of free kappa or lambda immunoglobulin light chains and has been particularly useful in monitoring patients with light chain MM relative to the 24-h urine PEP and IFE. The achievement of sCR compared to CR after ASCT translated into a superior OS from the time of transplantation on multivariate analysis and at the 2-year landmark [28]. Interestingly, a normal FLC ratio is not always concordant with negative serum IFE. In 122 myeloma patients at various stages of disease and therapy [29], the sensitivity of the serum FLC ratio in detecting the presence of a monoclonal protein by serum IFE was only 66 %, with a specificity of 69 %. Therefore, aside from the time of diagnosis or when documenting sCR, the FLC ratio may not be useful, as it may often reflect the degree of immunosuppression rather than the tumor burden.

In the case of heavy chain MM, a new method was recently developed and validated for the separate quantification of kappa- and lambda-bound circulating IgG and IgA, allowing analysis of the amount of Ig heavy/light chain (HLC) pairs. In a study examining sequential peripheral blood (PB) samples from 156 patients with IgG or IgA MM treated with induction therapy followed by ASCT [30], the HLC ratio indicated the presence of disease in 8/31 patients who achieved CR and, in sequential studies, indicated evolving relapse in three patients before serum immunofixation electrophoresis (SIFE) became positive. Multivariate analysis revealed the HLC ratio (p = 0.03) to be an independent risk factor for OS. HLC ratios were also studied in 37 patients with MM in CR after ASCT [31]. Although increased IgAK/IgAL and IgMK/IgML ratios were associated with longer PFS, there was no statistically significant difference in OS. Further prospective studies are required to determine the appropriate use of this novel assay.

Immunophenotypic Response: Multiparameter Flow Cytometry

Post-ASCT

The overview of methodologic limitations discussed in part 1 of this series is helpful for understanding and interpreting the data supporting and contradicting the use of multiparameter flow cytometry (MFC) for MRD. MRD by MFC, also termed immunophenotypic response (IR), was assessed in 397 of the 1114 patients in the Medical Research Council (MRC) Myeloma IX Trial who were assigned to the intensive regimen of alkylator-based induction therapy followed by ASCT, with MRD assessed at day +100. These 397 patients were those for whom an additional BM aspirate was taken and for whom IR was evaluable. Absence of MRD by six-color flow at day +100 was predictive of a favorable PFS (p < 0.001) and OS (p = 0.0183), in patients with favorable and adverse cytogenetics and in patients achieving an IFE-negative CR [32]. In another preplanned subset analysis of a prospective study, 241 patients enrolled in the Spanish GEM2000 and GEM2005 trials who had achieved a CR post-ASCT and who had a BM sample available for immunophenotypic analysis at diagnosis and at day +100 were evaluated for MRD using four-color flow. Of the 789 patients enrolled in GEM2000, 140 of the 147 who achieved CR were included. Similarly, of the 386 patients enrolled in the GEM2005 study, 101/138 who achieved CR were eligible for MRD analysis. Persistent MRD by MFC at day +100 (HR 8.0, p = 0.005) along with high-risk cytogenetics by FISH (HR 17.3, p = 0.002) were the only independent factors that predicted for unsustained CR and a median OS of only 39 months. In contrast, those who achieved an immunophenotypic CR (i.e., MRD negativity by MFC) at day +100 demonstrated a 3-year OS of 98 %. Of note, all 241 patients were evaluable for immunophenotypic response, although no distinction was reported between MRD-negative and inadequate BM samples [33]. More importantly, 5-year OS for those with an IR versus those without was 71 versus 60 %, respectively (p < 0.001). Therefore, persistent MRD (along with adverse cytogenetics) was able to discriminate a subset of CR patients with inferior outcome.

Maintenance Therapy

The effect of thalidomide maintenance was assessed in patients enrolled in the prospective MRC Myeloma IX study. Those patients who were MRD positive and did not receive maintenance had the shortest PFS. One third of those who were MRD positive but received maintenance thalidomide became MRD negative. Interestingly, the patients who were MRD negative and also received thalidomide maintenance had the longest PFS (p < 0.001) [32]. These findings come from an unplanned subgroup analysis and may be confounded by the interplay between thalidomide and high-risk genetics such as 17p deletion [34]. Even so, the benefit of maintenance in MRD-negative patients illustrates the importance of prospective randomized studies that incorporate MRD-based results at the time of randomization to currently used maintenance therapies (lenalidomide and bortezomib) in order to truly understand the role of MRD in risk-adapted therapy.

In the Nontransplant-Eligible Population

In the United States, ASCT eligibility is defined by performance status and presence of serious comorbidities. In a prospective series of 102 elderly, ASCT-ineligible newly diagnosed MM patients who achieved at least a partial response (PR) with ≥70 % reduction in M-component, disease response was assessed by SIFE, FLC, and IR by MFC after six cycles of induction therapy. All patients were evaluable for response. Achievement of IR translated into superior PFS (median not reached versus 35 months) compared with conventional CR or sCR, although there was only a trend toward longer OS. After multivariate analysis for PFS, only IR status after induction was an independent factor [35]. In the MRC IX trial, MRD by six-color MFC was assessed in a subset of 245 of the 856 (29 %) nontransplant-eligible patients who received melphalan-prednisone (MP) or attenuated CTD (CTDa) and achieved CR. Of these patients, the presence of MRD was associated with inferior PFS (14.1 vs 34.3 months, p = 0.0068) and nonsignificant inferior OS compared to those in CR with absence of MRD.

Limitations of Immunophenotypic Response

In the nontransplant-eligible arm of the MRC IX trial, when MRD was evaluated according to categorical response, IR after induction did not correlate with PFS or OS, and there were discordant results between sCR and IR in 22 % of patients at the time of the MRD testing [32]. On the one hand, 14.5 % of 214 patients achieving IFE-negative CR had detectable disease by MFC. On the other hand, 25.6 % of 246 MRD-negative patients failed to achieve CR, and 11.6 % failed to achieve at least a very good partial response (VGPR). These MRD-negative patients who failed to achieve a conventional CR had an outcome identical to that of patients who were MRD positive.

Therefore, it appears that the prognostic power of IR by MFC depends on therapeutic regimen and clinical setting. MRD negativity was associated with prolonged OS in the transplant-eligible population and PFS in the maintenance setting, but is not a consistently predictive variable for PFS after induction therapy in unselected (i.e., not only in those responding) transplant-ineligible patients. The need for a repeat BM aspiration to assess IR is an inherent limitation in the use of IR outside of clinical trials, particularly in transplant-ineligible patients. Additionally, although IR by MFC provides meaningful information regarding prognosis, the absence of MRD does not always correlate with SIFE negativity or radiographic negativity as will be further discussed below, indicating a limitation in its sensitivity and specificity when assessing burden of disease.

One limitation common to both the MRC IX trial [32] and the Spanish GEM2000 and GEM2005 trials [33] in assessing for MRD is that only a subset of the patient population was examined for MRD by MFC. Although the subset analysis was preplanned, sensitivity and specificity can be compromised when the denominator is limited (as occurs when patients are excluded from the analysis), thus introducing possible selection bias. Limiting the patients included to those who achieve CR can also affect the sensitivity and specificity of IR analysis, given the bidirectional discordance between categorical response and MRD, as seen in the MRC IX study.

Another theoretical limitation of MFC to assess for MRD is that most MFC methods begin with CD138 selection of the BM aspirate based on the expression of this antigen on mature plasma cells. However, Tiedemann et al. showed that primary MM tumors consist not only of plasma cells and plasmablasts but of earlier CD20+ B cell progenitor subpopulations that demonstrated partial repression of markers of plasma cell maturation, such as CD138, CD38, IL6R, and IL6ST [36]. BM aspirate samples taken from bortezomib-refractory patients were found to have a population of CD138-negative and XBP1-s-negative cells. Thus, current methods of MFC may be missing the various progenitor subpopulation clones of the MM tumor that may be crucial to future relapses.

Further, the sensitivity, specificity, and applicability of flow cytometry are limited by a variety of technical factors, as described in part 1 of this series. Although clinical applicability has improved with the transition from four (GEM/PETHEMA)- to six (MRC)- to eight-color flow, and 10-color flow is now in development, other factors can present barriers to its success. For example, time of sampling with respect to treatment (ideal time from obtaining aspirate to performing flow is <24 h), number of markers, number of cells counted (0.005–0.02 % cells), and marrow cellularity are all crucial technical factors that influence reproducibility. Immunophenotypic expression also varies with treatment and time. Given the variability in methodology, the technique for performing MFC must be standardized and harmonized, with agreement on a validated panel of markers, before it is FDA approved for clinical use.

Molecular Response by PCR

The two major types of polymerase chain reaction (PCR) used to detect MRD by molecular response in MM are fluorescent (F)-PCR of immunoglobulin genes and allele-specific oligonucleotide (ASO)-PCR. In contrast to MRD detection by MFC where the initial myeloma clone does not need to be detected, PCR-based platforms and NGS rely on the collection of sample at times of high disease load, such as initial diagnosis or relapse. Plasma cells >5 % are usually necessary for detection of the myeloma clone at the time of calibration.

Fluorescent PCR of Immunoglobulin Genes

After Induction and Consolidation

The prognostic influence of achieving molecular response assessed by fluorescent PCR (F-PCR) was examined in 130 newly diagnosed MM patients who achieved ≥VGPR following first-line therapy in the elderly population or after induction plus ASCT in the transplant-eligible population from the GEM2000 and GEM2005 trials [37]. Two hundred thirty-eight out of 271 patients (87 %) whose BM samples were tested for F-PCR were evaluable for response. Patients who achieved a molecular response (defined as presence of a clonal peak at diagnosis and its absence during follow-up) had a significantly longer PFS as compared with nonmolecular response patients (median 61 vs. 36 months, p = 0.006). OS was also significantly longer in molecular response patients versus nonmolecular response patients (median NR, 5-year 75 % versus median 66 months, p = 0.03).

After Nonmyeloablative Allogeneic Transplant

In two small studies comprising a total of 68/90 MM patients post-allo-SCT for whom patient-specific IGH rearrangement primers could be identified, MRD negativity by F-PCR has been shown to predict for a longer 2-year OS (N = 20/20 evaluable) [38] and a longer 5-year relapse-free survival (N = 48/70 evaluable) [12]. Larger prospective studies are needed to confirm these findings.

Limitations of F-PCR

One limitation of F-PCR is that it may not offer a high enough sensitivity for identifying different risk categories [39]. In fact, ≤75 % of patients have a tumor marker that can be amplified by using PCR, which hampers its use in routine clinical practice [40–43].

The applicability of IR appears much higher than that of molecular response, with 966/966 patients evaluable by MFC compared to 306/361 (84.7 %) patients evaluable by PCR. In addition, in the abovementioned PETHEMA study [37], IR by MFC was achieved less frequently than molecular response by F-PCR (18 versus 49 %). PFS was slightly longer in patients with IR than in those with molecular response (67 versus 61 months, respectively). Five-year OS showed similar findings: 95 versus 75 % for patients achieving IR versus molecular response. Thus, IR appeared to reflect a deeper response than molecular response by F-PCR.
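The applicability figures quoted above are simple ratios of evaluable to tested patients. As a small worked sketch, the pooled PCR denominator appears to combine the two F-PCR series described earlier (238/271 in the GEM cohort and 68/90 post-allo-SCT); that pooling is an inference from the numbers in the text rather than something the cited studies state, and the variable names are illustrative only.

```python
# (evaluable, tested) pairs taken from the F-PCR series quoted in the text.
pcr_series = [(238, 271), (68, 90)]

pcr_evaluable = sum(evaluable for evaluable, _ in pcr_series)
pcr_tested = sum(tested for _, tested in pcr_series)

pcr_rate = pcr_evaluable / pcr_tested   # fraction of tested patients who were evaluable
mfc_rate = 966 / 966                    # all MFC patients were reported as evaluable

print(f"PCR: {pcr_evaluable}/{pcr_tested} evaluable ({pcr_rate:.1%})")
print(f"MFC: 966/966 evaluable ({mfc_rate:.1%})")
```

The pooled PCR applicability comes out at roughly 85 %, in line with the figure quoted above up to rounding.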

Allele-Specific Oligonucleotide PCR

Post-Auto-SCT

Bakkus et al. evaluated ASO-PCR in 67/87 patients for whom VH-Ig gene sequencing was attempted 3–6 months post-ASCT and found that a cut-off value of 0.015 % residual clonal cells divided patients into prognostically distinct groups in terms of PFS (64 versus 16 months, p = 0.001) [44]. Multivariate analysis showed the PCR-based grouping to be an independent prognostic factor for PFS. The clinical utility of MRD monitoring was also examined in 24/32 evaluable MM patients for whom clonal cells were detected by ASO-PCR post-ASCT [40]. Clonotypic cells (MRD positivity) were detected in 71 % of patients. Patients with low MRD, defined as ≤0.01 % residual clonal cells (equivalent to a residual tumor load of <10⁻⁴), displayed a longer PFS (34 versus 15 months) than those with a higher MRD level (p = 0.042).

The predictive significance of quantitative ASO-PCR designed for each patient to match the hypervariable CDR3 region of the individual IgH was evaluated in comparison with that of IFE in 21/37 patients who had achieved CR/near-CR after autologous (N = 18) or allogeneic (N = 3) SCT for whom allele specific primers could be successfully designed. A threshold level of 0.01 % in the quantitative ASO-PCR assay 3–6 months after SCT was found to be a useful cutoff limit to divide the patients into two prognostic groups: MRD low/negative (≤0.01 %) vs MRD high (>0.01 %). Low/negative MRD after SCT was a significant predictive factor for PFS (70 vs 19 months; p = 0.003), although significance for OS was not reached [45]. Although not reported, a multivariate analysis would be useful in this setting to determine independent prognostic value.
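Because MRD cutoffs are quoted both as percentages (0.01 %) and as powers of ten (10⁻⁴), the conversion between the two can be made explicit. The sketch below is illustrative only: the function names are hypothetical, and the 0.01 % default simply mirrors the cutoff described above rather than a validated clinical threshold.

```python
import math


def percent_to_fraction(percent: float) -> float:
    """Convert a percentage of residual clonal cells to a cell fraction."""
    return percent / 100.0


def log_depth(fraction: float) -> float:
    """Orders of magnitude below 1, e.g. a fraction of 1e-4 gives a depth of 4."""
    return -math.log10(fraction)


def classify_mrd(mrd_percent: float, cutoff_percent: float = 0.01) -> str:
    """Assign the two prognostic groups described in the text."""
    return "low/negative" if mrd_percent <= cutoff_percent else "high"


# 0.01 % corresponds to about 1 clonal cell in 10,000 (10^-4).
fraction = percent_to_fraction(0.01)
print(fraction, round(log_depth(fraction), 2), classify_mrd(0.01))
```

On the same scale, the 0.015 % cutoff used by Bakkus et al. corresponds to 1.5 × 10⁻⁴, so the two ASO-PCR thresholds are close but not identical.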

Post-Allo-SCT

MRD negativity by ASO-PCR correlated with longer PFS following allo-SCT in a study of 12 MM patients who achieved sCR following allo-SCT [11]. Thirteen of 26 patients who were in CR following transplantation had BM aspirate samples that could be retrospectively evaluated for MRD by PCR. Another small study of 14/30 evaluable MM patients who achieved a clinical CR post-allo-SCT confirmed that MRD negativity post-allo correlates with a prolonged clinical remission [46].

Limitation of ASO-PCR

While some groups have found that patients achieving PCR negativity had a prolonged PFS after allo- or auto-SCT [12, 47, 48], other studies determined that clonotypic cells persist in virtually all MM patients after ASCT, preventing an efficacious prognostic correlation [49, 50]. A reason for this may be its very high sensitivity and perhaps lower specificity. Yet, despite the high degree of sensitivity, most patients who achieve MRD-negative status by quantitative PCR still eventually relapse. It is also essential when using quantitative studies to establish a validated threshold to separate high from low MRD levels to better allow establishment of prognostic variables [40]. Validation, however, is not a trivial issue. ASO-PCR is associated with high technical complexity and often low applicability [46]. Although some alternate PCR strategies (i.e., fluorescent PCR of Ig genes) for assessing MRD could improve the applicability, they typically result in decreased sensitivity.

Next-Generation Sequencing

Using a sequencing-based platform, the LymphoSIGHT method (Sequenta Inc.), a clonal rearrangement was identified and the prognostic value of MRD detection was assessed in 121/133 MM patients treated within the GEM2000 or GEM05 < 65 protocols who achieved ≥VGPR after induction therapy [51]. Molecular response by deep sequencing, defined as an MRD level <10⁻⁵, was associated with significantly longer time to progression (TTP) (median 80 versus 31 months, p < 0.001) and OS (median not reached versus 81 months, p = 0.02). In the multivariate analysis for TTP, MRD negativity by deep sequencing was the single variable with statistical significance (HR 8.6, p = 0.012). When limiting the study to the 62 patients who achieved conventional CR at time of analysis, 58 % of those in CR were positive by sequencing at MRD levels of 10⁻⁵ or higher. With a median follow-up of 42 months, patients in CR who were MRD negative by deep sequencing had significantly longer TTP (median 131 versus 35 months, p = 0.0009) compared with patients in CR who were MRD positive by sequencing. Median OS was not reached in either group. It is notable that those who were in CR and MRD negative by NGS achieved one of the longest PFS known to date (131 months) compared to historical controls. Thus, true MRD negativity speaks toward depth of response and may be a meaningful endpoint in terms of attaining an eventual cure.

In the abovementioned study, a high level of concordance was observed between MRD levels by deep sequencing and MRD by both MFC and ASO-PCR. All methods differentiated between MRD-positive high-risk patients and MRD-negative cases, which exhibited a favorable prognosis. MFC demonstrated the greatest applicability, with 102/102 evaluable patients, compared with 121/133 (91 %) patients evaluable by deep sequencing and 238/271 (87 %) patients evaluable for molecular response. Deep sequencing and PCR demonstrated the highest sensitivity, with MRD detectable at levels of 10⁻⁵ or higher, whereas sensitivity ranged between 10⁻⁴ and 10⁻⁵ using MFC. It is important to note that the sensitivity of all approaches is limited by the amount and purity of DNA or the number of cells analyzed, which depend on specimen quality. As a result, the sensitivity will improve if more BM material is obtained for analysis, which is especially challenging in a disease where BM sampling is not uniform. Patients who were in CR and MRD negative by deep sequencing had a 5-year OS of 88 % compared to 78 % in patients who were in CR and MFC negative. However, MRD by MFC has been shown to correlate independently with PFS and OS, and independent correlation has not yet been established using PCR or deep sequencing.

Multiple Myeloma Circulating Tumor Cells

Recent observations suggest that tumor cell dissemination, or metastasis, is often an early event, and the clinical consequences of circulating tumor cells (CTCs) in nonhematologic malignancies have been the focus of extensive research [52, 53]. Although the BM (required for IR, ASO-PCR, and NGS) is much more accessible than solid tumor tissue, most MM patients would rather not undergo a repeat BM aspiration. Fortunately, CTCs, defined as clonal PCs in PB, have also been shown to be prognostic. The methods of CTC detection include microscopy, flow cytometry, and NGS.

Newly Diagnosed

High levels of CTCs evaluated by slide-based immunofluorescence microscopy have been shown to be an adverse risk factor for progression from both MGUS (N = 325) and smoldering MM (N = 171) to symptomatic MM [54, 55]. The prognostic value of CTCs by MFC (gating on CD38+ CD45− cells) was examined in 302 patients with newly diagnosed multiple myeloma who underwent alkylator- or thalidomide-based induction therapies with or without ASCT [56]. Circulating plasma cells (CPCs) were detected in 222/302 (73.5 %) patients. The median OS for those who had ≤10 CPCs per 50,000 cells analyzed (N = 186) compared to those who had >10 CPCs per 50,000 cells analyzed (N = 115) was 58.7 versus 37.3 months (p = 0.01). The adverse prognosis of CPCs remained independent on multivariable analysis.

Relapsed/Refractory

In the relapsed, refractory setting, MFC was used to evaluate the prognostic value of CTCs in 42 patients with refractory or relapsed MM, 92 % of whom were treated with bortezomib-based therapy [57]. Aberrant CPCs (aCPCs) were detected in 24/42 (57.1 %) of patients before treatment and in 22/42 (52 %) after one cycle of treatment. Failure to achieve a CPC reduction after one treatment course identified patients who were refractory to treatment and at risk for disease progression during the planned treatment period. Patients who did not have any detectable aCPCs in both pre- and post-chemotherapy samples had an improved median OS compared to those who had a decrease in aCPCs post-chemotherapy and those with no change or an increase in aCPCs post-chemotherapy (1006 versus 856 versus 308 days, respectively, p = 0.007).

Various Stages of Disease

Paiva et al. explored the phenotypic, cytogenetic, and functional characteristics of CTCs by MFC, FISH, cell-cycle analysis, and colony assays from MM patients by comparing them to patient-paired BM clonal PCs [58]. The results suggest that CTCs represent a unique subset of MM cells with clonogenic potential and a quiescent phenotype, which may potentially be driven to circulate by circadian rhythms in a similar pattern to that of CD34(+) cells [58]. These cells represent a unique subpopulation of all BM MM cells, characterized by a downregulation of integrins, adhesion, and activation molecules.

Vij et al. identified a deep-sequencing approach to detect and quantify myeloma cells in the PB and were able to detect a PC clone in 44/46 (96 %) patients using DNA or RNA from PBMCs and in 45/46 (98 %) of patients using serum/plasma because of optimized primer sets for amplification of the IGH and IGK loci. MRD was detected at levels <1 per million leukocytes [59]. Applicability, however, similar to other PCR-based platforms for MRD analysis is limited since the sequencing method utilizes a high disease load sample for initial identification of the myeloma clone. Correlation with clinical outcome and prognosis using novel PB sequencing has yet to be determined.

Whole Body PET-CT and MRI as a Measure of MRD

The application of novel imaging techniques such as PET-CT has the potential to add sensitivity for detection of both extramedullary (EMD) and medullary disease, to monitor for disease in oligo- or nonsecretory MM (present in 1–5 % of MM patients at diagnosis) and to provide prognostic information. Complete FDG suppression in focal lesions prior to ASCT in 239 patients treated with Total Therapy 3 correlated with better outcomes and was only opposed in multivariate analysis by GEP-defined high-risk status. Taken together, PET-CT and high-risk GEP accounted for approximately 50 % of survival variability [60]. The correlation between PET/CT negativity with superior 4-year OS and PFS was confirmed by another group [61].

Magnetic resonance imaging (MRI) permits the detection of diffuse and focal BM infiltration in the absence of osteopenia or focal osteolysis on standard metastatic bone surveys (MBS). In 611 patients with MM treated with tandem ASCT, focal lesions identified by MRI, but not by MBS, independently affected survival, and resolution of lesions on MRI conferred superior OS [62].

In cases of predominantly macrofocal bony disease, disease burden is easily appreciated by PET or MRI, even in the setting of CR by IMWG criteria [60–63]. Although there is no literature reporting MFC negativity in the setting of PET or MRI positivity, it is conceivable that sampling from a single iliac crest that may be uninvolved in macrofocal bony disease could account for MRD negativity, despite a significant tumor load residing in other sites.

Conclusion

MRD assessment has gained great importance in the response evaluation of MM and will continue to inform our understanding of disease biology. Techniques such as MFC, PCR, and NGS, although not yet routinely available, have the potential to achieve a higher level of sensitivity, up to 1 in 10⁶ cells, with an improved quantifiable range, and will enable analysis of genetic diversity. Table 1 summarizes the different techniques used to identify MRD and their clinical prognostic impact. It is noteworthy that the prognostic significance of MFC MRD was seen in the ASCT-eligible patients but not consistently in the nontransplant-eligible population.

Table 1 Comparison of MRD methodologies and correlation with clinical outcomes

A disadvantage of molecular-based MRD methodology compared to MFC is the necessity for initial clone identification at a time when disease burden is high (≥1 million plasma cells for PCR or 500 ng of DNA in triplicate and ≥7 million plasma cells or 7 μg of DNA for NGS), thereby limiting applicability to only evaluable patients [64, 65]. Although NGS of BM aspirate or PB does require identification of patient-specific clones, universal primer sets are used to identify myeloma-specific clonotypes for each patient based on their high frequency in the sample. Thus, NGS can detect MRD despite clonal evolution. In contrast, PCR- and MFC-based technologies are hampered by the fact that the initial clone may change over time, thereby limiting their use in relapsed disease. At this time, the PCR-based methodology cited here is not commercially available for use, and the prognostic value of NGS of PB has not been clinically validated.

The utility of MRD, as elegantly modeled in chronic myelogenous leukemia (CML), lies in the fact that one predominant clone, measured by the BCR-ABL transcript, is slowly eliminated with treatment and reflects the decrease in disease burden. In CML, MRD monitoring at established timepoints correlates well with long-term outcomes [66]. However, MM is marked by clonogenic heterogeneity even from the premalignant states. No single genomic change is necessary or sufficient, which is a major distinction from CML. Figure 1 illustrates how we might use the CML paradigm in starting to approach the burden of disease in MM in a patient who is radiographically negative (given the limitations of MRD testing on BM aspirates in patients with macrofocal disease).

Fig. 1 MRD as a measure of disease burden over time. The paradigm of MRD in CML may be adapted in starting to approach the burden of disease in MM. With the caveat that the patient shows no sign of disease radiographically, disease burden decreases logarithmically over time with therapy, as measured by increasingly sensitive methods of MRD detection. For example, MRD-negativity as measured by NGS may constitute up to a 10-fold log reduction in disease burden compared to the disease burden or MRD as measured by PEP or IFE. The hope is that ultimately, with improvements in therapy and with standardization and harmonization of methods of MRD detection, the disease burden can be eliminated completely and cure may be attained.

It is essential, however, to develop a rational, cost-effective, and minimally invasive approach to MRD testing in MM. Given the limited availability and uncertain comparative benefits of the different tests (let alone the complete absence of prospective studies evaluating a change in therapy based on MRD), further studies are required before performing MRD assessment in routine practice.

Recommended Current Use of MRD Testing

In a clinical trial setting, we propose that, at the time of diagnosis, routine tests be performed on blood (SPEP/IFE/free light chain/heavy-light chain) and urine (UPEP/IFE). A PB sample for NGS and a diagnostic BM aspirate should also be sent to determine whether patient-specific clones (preferably by NGS over PCR) can be identified. Once a given patient’s initially abnormal blood and urine results show normalization, i.e., attainment of a CR, NGS in PB should be tested in parallel with a BM aspiration for response evaluation. MRD should be evaluated by MFC within hours of the marrow aspirate procedure given the lability of plasma cells ex vivo, which limits the sensitivity of MFC if not performed promptly. If the patient is evaluable for MRD by NGS (of PB or BM) or PCR (of BM), then this should be repeated. Once comparative data regarding the various MRD methodologies are available from large prospective clinical trials, hopefully the number of tests needed can be minimized; in particular, if NGS from PB is validated, the repeat BM aspirate could be avoided altogether.

If monitoring MRD by highly sensitive techniques such as MFC, PCR, NGS, and imaging is standardized and reliably predicts survival, then we can move closer to the ultimate goal of using MRD in MM to allow for personalized therapy based on depth of response and also determine the efficacy of novel agents sooner. Equally important, a biological understanding of the persistence of MRD (e.g., CD138-negative cells in the setting of bortezomib refractory disease) will be fundamental in moving from long-term remissions to an eventual cure.