Introduction

Lymphomas are solid tumors of the immune system that most commonly affect the lymph nodes and spleen. They are usually classified as Hodgkin lymphoma (HL) or non-Hodgkin lymphoma (NHL). Accurate staging is essential in both HL and NHL for planning effective management. Bone marrow infiltration (BMI) is found in 5–15% of patients with HL and 20–40% of patients with NHL, and its presence indicates stage IV disease [1,2,3,4]. Several modalities can be used for staging and for assessing BMI in lymphomas. Bone marrow biopsy (BMB) is commonly used and is considered the gold standard for this purpose [5]. However, BMB is invasive, and a negative biopsy does not rule out infiltration at other sites, because bone marrow infiltration in HL can be either focal or diffuse.

According to the National Comprehensive Cancer Network (NCCN) and the Society of Nuclear Medicine (SNM), fluorodeoxyglucose positron emission tomography with computed tomography (FDG-PET/CT) is essential for staging patients with HL [6, 7]. FDG-PET/CT is noninvasive, can detect lesions throughout the body, and is already used for staging and response assessment in lymphomas. Imaging techniques have also been proposed for the detection of BMI, particularly FDG-PET, which has shown good concordance with BMB results. MRI is also a sensitive test, but it is not used routinely [8]. Cortes Romera et al. proposed using FDG-PET/CT to guide the selection of additional BMB sites according to the sites of uptake [9]. In a retrospective study, Weiler et al. found that with FDG-PET, 36 patients could have been upstaged from stage III to IV and 16 patients from stage II to IV, which would have affected their treatment protocol [10]. Cheng et al. also found FDG-PET/CT to be superior to BMB for BMI detection in pediatric lymphoma patients [11]. Moulin-Romsee et al. reported that routine BMB is unnecessary when FDG-PET/CT is negative and that FDG-PET/CT can even detect BM lesions that are not seen on biopsy [12]. Herrmann et al. found that the diagnostic performance of FDG-PET/MRI in lymphoma patients was similar to that of FDG-PET/CT [13].

Although BMB is accurate, it is invasive and samples only a small, essentially random portion of the bone marrow, which may increase the rate of false negatives. MRI and FDG-PET/CT, in contrast, are non-invasive and can visualize the bone marrow bilaterally. However, because comparative data on the sensitivity and specificity of invasive and non-invasive methods are scarce, we carried out a meta-analysis to determine the role of these tests in detecting bone marrow infiltration and in staging patients with HL and NHL.

Methodology

Search Strategy

We performed a meta-analysis according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [14]. A systematic search was carried out in PubMed for articles published between 2000 and 2022 with the following keywords: (“MRI” OR “magnetic resonance imaging”) AND (“bone marrow involvement” OR “bone marrow infiltration” OR “marrow” OR “infiltration involvement” OR “biopsy” OR “bone marrow biopsy” OR “BMB” OR “histopathological”) AND (“FDG” OR “fluorodeoxyglucose” OR “PET” OR “positron emission tomography”) AND (“Hodgkin” OR “Hodgkin lymphoma” OR “non-Hodgkin lymphoma” OR “Hodgkin disease” OR “lymphoma”). After removal of duplicates, irrelevant studies, non-English texts, and articles that did not provide any quantitative data, each potentially eligible article was examined independently by two investigators to determine whether it met the predetermined inclusion criteria. Any disagreement between the investigators was settled through discussion based on the inclusion and exclusion criteria.

Study Selection

Published articles were selected if they reported the diagnostic accuracy of MRI or FDG-PET/CT for lymphomas with biopsy or histopathological confirmation as the reference test. For articles that reported both HL and NHL, data were entered for each condition separately, and the combined results were also entered. Only articles available in English or translated into English were selected. We excluded articles that did not present quantitative data as numbers or percentages, were case reports, had no data on diagnostic accuracy in terms of true positives, true negatives, false positives, or false negatives, or did not use a biopsy as the reference test for comparison. Articles reporting DWI-MRI results were selected for analysis only if they had data on all the MRI sequences, i.e., T1-weighted imaging, T2-weighted imaging, STIR, and DWI.

From the included studies, we collected data on patient characteristics, the outcome of interest, the reference test used, and the index test being compared. We also extracted study characteristics, including study design, publication year, country, sample size, mean/median age, and gender, together with the data on diagnostic parameters.

Endpoints

Diagnostic accuracy was calculated as sensitivity, specificity, positive likelihood ratio, and negative likelihood ratio for MRI or FDG-PET/CT in patients diagnosed with lymphoma, using either a biopsy or any form of histopathological confirmation as the reference test.
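As an illustration of how these endpoints relate to the 2×2 table of a single study, the following minimal Python sketch computes them from true/false positive and negative counts. The class and function names, the example counts, and the 0.5 continuity correction applied when a cell is zero are assumptions made for illustration; they are not taken from the included studies or from the software used in this analysis.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Index test (MRI or FDG-PET/CT) cross-classified against the biopsy reference."""
    tp: float  # index test positive, biopsy positive
    fp: float  # index test positive, biopsy negative
    fn: float  # index test negative, biopsy positive
    tn: float  # index test negative, biopsy negative


def accuracy_measures(t: TwoByTwo) -> dict:
    """Return sensitivity, specificity, likelihood ratios and diagnostic odds ratio."""
    tp, fp, fn, tn = t.tp, t.fp, t.fn, t.tn
    # Assumption: add 0.5 to every cell when any cell is zero, a common
    # convention that avoids division by zero in the ratios below.
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (x + 0.5 for x in (tp, fp, fn, tn))
    sens = tp / (tp + fn)          # proportion of biopsy-positive patients detected
    spec = tn / (tn + fp)          # proportion of biopsy-negative patients cleared
    return {
        "sensitivity": sens,
        "specificity": spec,
        "lr_positive": sens / (1 - spec),
        "lr_negative": (1 - sens) / spec,
        "dor": (tp * tn) / (fp * fn),  # equals lr_positive / lr_negative
    }


# Hypothetical counts, for illustration only.
measures = accuracy_measures(TwoByTwo(tp=18, fp=10, fn=2, tn=90))
print(measures)
```

A positive likelihood ratio well above 1 and a negative likelihood ratio well below 1 both indicate a more informative test, which is why the two are reported together with the sensitivity and specificity.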

Quality Assessment

To assess the methodological quality of the included studies, including the risk of bias and applicability concerns, we used the revised Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool. The tool comprises four domains: patient selection, index test, reference standard, and flow and timing. All selected studies were assessed in each domain: patient selection (Hodgkin lymphoma and non-Hodgkin lymphoma), index test (FDG-PET, combined FDG-PET/CT, and MRI), reference standard (histopathological confirmation or bone marrow biopsy), and flow and timing between the reference standard and the index test. The risk of bias was assessed in all four domains, and concerns about applicability were assessed in the first three, using signaling questions that were answered “YES” for low risk of bias or applicability concern, “NO” for high risk of bias or applicability concern, or “UNCLEAR” if the data provided were insufficient or inconclusive [15].

Statistical Analysis

The included studies compared either MRI or FDG-PET/CT for lymphoma diagnosis or staging with a reference biopsy test. The analysis was performed at the study level with OpenMeta Analyst software version 5.26.14 [16]. A p value < 0.05 was considered statistically significant. We calculated the pooled sensitivity, specificity, positive likelihood ratio (LR), negative LR, and diagnostic odds ratio (DOR), each with a 95% confidence interval (CI), as well as summary receiver operating characteristic (SROC) curves. We assessed heterogeneity with the I² statistic: values between 0 and 25% were considered insignificant, 25–50% low, 50–75% moderate, and 75–100% high. Funnel plots were used to assess publication bias. Subgroup analysis was also performed for HL only, NHL only, and all lymphomas together.
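The relationship between study-level estimates, a pooled estimate, and I² can be sketched as follows. This is a minimal illustration that pools logit-transformed sensitivities with a DerSimonian-Laird random-effects model; the choice of that model, the 0.5 continuity correction, the function names, and the handling of values that fall exactly on a band boundary are all assumptions, and the sketch does not claim to reproduce the algorithm implemented in the software used for this analysis.

```python
import math


def pool_sensitivity(studies):
    """Pool sensitivities from a list of (true_positive, false_negative) counts.

    Returns (pooled_sensitivity, (ci_low, ci_high), i_squared_percent).
    """
    effects, variances = [], []
    for tp, fn in studies:
        tp_c, fn_c = tp + 0.5, fn + 0.5            # assumed continuity correction
        effects.append(math.log(tp_c / fn_c))       # logit of the sensitivity
        variances.append(1 / tp_c + 1 / fn_c)       # approximate variance of the logit

    # Fixed-effect weights give Cochran's Q, from which I^2 is derived.
    w = [1 / v for v in variances]
    fixed = sum(wi * y for wi, y in zip(w, effects)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, effects))
    df = len(studies) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # DerSimonian-Laird estimate of the between-study variance tau^2.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects pooled logit and its 95% confidence interval.
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * y for wi, y in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))

    def inv_logit(x):
        return 1 / (1 + math.exp(-x))

    ci = (inv_logit(pooled - 1.96 * se), inv_logit(pooled + 1.96 * se))
    return inv_logit(pooled), ci, i2


def heterogeneity_label(i2):
    """Map I^2 (in percent) to the bands used in this analysis.

    Assumption: a value exactly on a boundary is assigned to the higher band.
    """
    if i2 < 25:
        return "insignificant"
    if i2 < 50:
        return "low"
    if i2 < 75:
        return "moderate"
    return "high"


# Hypothetical (TP, FN) counts from three studies, for illustration only.
sens, ci, i2 = pool_sensitivity([(18, 2), (30, 10), (12, 8)])
print(round(sens, 3), tuple(round(x, 3) for x in ci), round(i2, 1), heterogeneity_label(i2))
```

The same steps apply to specificity if the (TP, FN) pairs are replaced by (TN, FP) pairs.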

Results

Baseline Characteristics

Initially, 387 articles were obtained by searching the database, which underwent primary screening. Three hundred thirty-nine articles were excluded because they were case reports, because of their titles, or because they did not match the inclusion criteria. Later, 48 studies underwent a full review as a secondary screening. Sixteen were excluded as they did not have sufficient data, either in terms of the diagnostic test performed or how it was represented graphically or qualitatively; eight were excluded as they did not use biopsy as a reference test in that study; and one was excluded as it was in the German language. Finally, we included a total of 24 studies with a total of 2969 patients in our analysis, which met our inclusion criteria, as shown in Fig. 1. The baseline features of the included studies are shown in Table 1.

Fig. 1

PRISMA flow diagram showing the strategy used to identify and evaluate studies for the meta-analysis. In the end, 24 studies were included in the analysis

Table 1 Baseline characteristics of the included studies

Diagnostic Performance of FDG-PET/CT

Twenty-one studies compared FDG-PET/CT as an index test with histopathological analysis as the reference standard in HL and NHL patients. FDG-PET/CT demonstrated an overall pooled sensitivity of 0.771 (95% CI = 0.652–0.858) and specificity of 0.897 (95% CI = 0.859–0.926), as shown in Figs. 2 and 3, respectively, where each pooled estimate and its 95% confidence interval are represented by the diamond at the bottom of the center vertical line in the forest plot. For detecting lymphoma in the bone marrow, the positive LR was 6.716 (95% CI = 4.983–9.051), the negative LR was 0.187 (95% CI = 0.120–0.291), and the diagnostic OR was 39.045 (95% CI = 18.109–84.186). The p value for all pooled estimates was < 0.001.

Fig. 2

Forest plot showing the pooled sensitivity of FDG-PET/CT for detecting bone marrow involvement in patients with lymphoma. Note: the pooled sensitivity with its 95% confidence interval is shown by the diamond at the bottom of the center vertical line in the forest plot

Fig. 3

Forest plot showing the pooled specificity of FDG-PET/CT for detecting bone marrow involvement in patients with lymphoma. Note: the pooled specificity with its 95% confidence interval is shown by the diamond at the bottom of the center vertical line in the forest plot

Diagnostic Performance of MRI

Ten studies compared MRI as an index test with histopathological analysis as the reference standard in HL and NHL patients. MRI demonstrated an overall pooled sensitivity of 0.778 (95% CI = 0.637–0.875) and specificity of 0.886 (95% CI = 0.793–0.940), as shown in Figs. 4 and 5, respectively. For detecting lymphoma in the bone marrow, the positive LR was 6.969 (95% CI = 3.406–14.257), the negative LR was 0.191 (95% CI = 0.081–0.451), and the diagnostic OR was 39.184 (95% CI = 9.102–168.688). The p value for all pooled estimates was < 0.001.

Fig. 4

Forest plot showing the pooled sensitivity of MRI for detecting bone marrow involvement in patients with lymphoma. Note: the pooled sensitivity with its 95% confidence interval is shown by the diamond at the bottom of the center vertical line in the forest plot

Fig. 5

Forest plot showing the pooled specificity of MRI for detecting bone marrow involvement in patients with lymphoma. Note: the pooled specificity with its 95% confidence interval is shown by the diamond at the bottom of the center vertical line in the forest plot

Diagnostic Performance of FDG-PET/CT and MRI in Patients with HL

When the studies were subgrouped by disease, 13 studies used FDG-PET/CT and 3 studies used MRI as the index test for detecting bone marrow involvement in HL patients, with histopathological analysis as the reference test. FDG-PET/CT demonstrated an overall pooled sensitivity of 0.903 (95% CI = 0.847–0.940) and specificity of 0.878 (95% CI = 0.820–0.919), as shown in Figs. 6 and 7, respectively. For detecting HL in the bone marrow, the positive LR was 7.106 (95% CI = 4.787–10.548), the negative LR was 0.102 (95% CI = 0.066–0.156), and the diagnostic OR was 111.068 (95% CI = 36.995–333.450), each with a p value < 0.001.

Fig. 6

Forest plot showing the pooled sensitivity of FDG-PET/CT for detecting bone marrow involvement in patients with Hodgkin lymphoma. Note: the pooled sensitivity with its 95% confidence interval is shown by the diamond at the bottom of the center vertical line in the forest plot

Fig. 7

Forest plot showing the pooled specificity of FDG-PET/CT for detecting bone marrow involvement in patients with Hodgkin lymphoma. Note: the pooled specificity with its 95% confidence interval is shown by the diamond at the bottom of the center vertical line in the forest plot

MRI demonstrated an overall pooled sensitivity of 0.846 (95% CI = 0.468–0.972; p value = 0.068) and specificity of 0.806 (95% CI = 0.472–0.951; p value = 0.069), as shown in Fig. 8.

Fig. 8

Forest plot showing the pooled sensitivity and specificity of MRI for detecting bone marrow involvement in patients with Hodgkin lymphoma. Note: the pooled sensitivity and specificity with their 95% confidence intervals are shown by the diamonds at the bottom of the center vertical line in the forest plot

Diagnostic Performance of FDG-PET/CT and MRI in Patients with NHL

In the NHL subgroup, 6 studies used FDG-PET/CT and 5 studies used MRI as the index test for detecting bone marrow involvement, with histopathological analysis as the reference test. As shown in Figs. 9 and 10, FDG-PET/CT demonstrated an overall pooled sensitivity of 0.569 (95% CI = 0.309–0.796; p value = 0.841) and specificity of 0.866 (95% CI = 0.687–0.950; p value = 0.003). For detecting NHL in the bone marrow, the positive LR was 3.930 (95% CI = 1.439–10.730; p value = 0.027), the negative LR was 0.398 (95% CI = 0.189–0.838; p value = 0.030), and the diagnostic OR was 9.546 (95% CI = 1.942–46.916; p value = 0.024).

Fig. 9

Forest plot showing the pooled sensitivity of FDG-PET/CT for detecting bone marrow involvement in patients with non-Hodgkin lymphoma. Note: the pooled sensitivity with its 95% confidence interval is shown by the diamond at the bottom of the center vertical line in the forest plot

Fig. 10

Forest plot showing the pooled specificity of FDG-PET/CT for detecting bone marrow involvement in patients with non-Hodgkin lymphoma. Note: the pooled specificity with its 95% confidence interval is shown by the diamond at the bottom of the center vertical line in the forest plot

MRI demonstrated an overall pooled sensitivity of 0.777 (95% CI = 0.469–0.932) and specificity of 0.795 (95% CI = 0.599–0.910), as shown in Figs. 11 and 12, respectively. For detecting NHL in the bone marrow, the positive LR was 3.747 (95% CI = 1.414–9.933), the negative LR was 0.192 (95% CI = 0.047–0.788), and the diagnostic OR was 20.007 (95% CI = 1.390–288.023).

Fig. 11

Forest plot showing the pooled sensitivity of MRI for detecting bone marrow involvement in patients with non-Hodgkin lymphoma. Note: the pooled sensitivity with its 95% confidence interval is shown by the diamond at the bottom of the center vertical line in the forest plot

Fig. 12

Forest plot showing the pooled specificity of MRI for detecting bone marrow involvement in patients with non-Hodgkin lymphoma. Note: the pooled specificity with its 95% confidence interval is shown by the diamond at the bottom of the center vertical line in the forest plot

Discussion

The current method for initial staging and evaluating treatment response for lymphomas consists of a computed tomography (CT) scan and a bone marrow histopathologic examination. FDG-PET/CT is an imaging modality that has high sensitivity and specificity for HL or aggressive NHL. MRI is also gaining momentum as a suitable non-invasive test for staging lymphomas.

Comparing FDG PET/CT with Bone Marrow Biopsy in HL and NHL Patients

FDG-PET/CT is sensitive in lymph node staging and in determining organ involvement [37]. Ngeow et al. compared FDG-PET/CT findings with BMB findings in 122 lymphoma patients and found that FDG-PET/CT and marrow results concurred in 107 (88%) cases, with 100 being concordantly negative and seven concordantly positive. They estimated the sensitivity of FDG-PET/CT to be 41% and the specificity to be 95%, with an NPV of 91% and a PPV of 58% [32]. In our study, after analysis of the data from the 21 studies that compared FDG-PET/CT with BMB, FDG-PET/CT displayed an overall pooled sensitivity of 77.1% and specificity of 89.7% for detecting bone marrow involvement in all lymphoma patients.

Hodgkin’s Lymphoma

In a study of 454 HL patients, El Galaly et al. found that skeletal lesions on FDG-PET/CT identified BMB-positive and BMB-negative patients with a sensitivity and specificity of 85% and 86%, respectively. The PPV and NPV were 28% and 99%, respectively. They also found that omitting BMB would not have affected the management protocol or risk assessment in these patients [38]. We found that in HL patients, FDG-PET/CT showed an overall pooled sensitivity of 90% and specificity of 88% in detecting bone marrow involvement. A study by Purz et al. in pediatric lymphoma patients did not find any false-negative PET results and calculated the sensitivity and NPV of FDG-PET/CT to be 100%, using a combination of BMB and CT as the reference [34].
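The low PPV alongside a high NPV quoted above is what Bayes' rule predicts when the condition is uncommon. The following minimal sketch derives predictive values from sensitivity, specificity, and prevalence. The function name and the prevalence of 6% are assumptions chosen for illustration (6% lies within the 5–15% range of BMI in HL quoted in the Introduction); the prevalence is not a figure reported by the cited study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied to a population with the given prevalence."""
    true_pos = sensitivity * prevalence               # diseased and test positive
    false_pos = (1 - specificity) * (1 - prevalence)  # healthy but test positive
    false_neg = (1 - sensitivity) * prevalence        # diseased but test negative
    true_neg = specificity * (1 - prevalence)         # healthy and test negative
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity 85% and specificity 86% as quoted in the text; prevalence is assumed.
ppv, npv = predictive_values(0.85, 0.86, 0.06)
print(round(ppv, 2), round(npv, 2))  # roughly 0.28 and 0.99
```

Under this assumption, most positive scans are false positives simply because uninvolved marrow is far more common than involved marrow, while a negative scan remains highly reassuring.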

Non-Hodgkin’s Lymphoma

In their study of follicular lymphoma patients, Wohrer et al. (2006) found that FDG-PET detected marrow infiltration in 54% of patients with a positive bone marrow biopsy [39]. Elstrom et al. (2003) compared FDG-PET results with iliac crest biopsy results for the detection of bone marrow infiltration and found suboptimal results for all types of lymphomas [40]. Xiao-Xue et al. (2020) found the sensitivity, specificity, and accuracy of FDG-PET/CT in detecting marrow infiltration in diffuse large B-cell lymphoma (DLBCL) to be 80%, 90%, and 88.1%, respectively, and in follicular lymphoma (FL) to be 37.5%, 87.5%, and 70.8%, respectively [41]. In our analysis, FDG-PET/CT demonstrated an overall pooled sensitivity and specificity of 56.9% and 86.6%, respectively, in NHL patients.

Comparing MRI with BMB in Lymphoma Patients

A study by Wu et al. (2012) calculated the overall pooled sensitivity and specificity of FDG-PET/CT to be 91.6% and 90.3%, respectively, with a diagnostic odds ratio (DOR) of 68.89, while the sensitivity and specificity of MRI were found to be 90.3% and 75.9%, respectively, with a DOR of 26.94. The DOR is the ratio of the odds of a positive test result in patients with the disease to the odds of a positive result in patients without it, so a higher DOR indicates better discrimination. They therefore concluded that FDG-PET/CT was more diagnostically accurate than MRI in the detection of BMI [42].

Another study by Jiang et al. in malignant lymphoma patients found the pooled sensitivity and specificity of MRI to be 82% and 79%, respectively, with a positive likelihood ratio (LR) of 3.9 and a negative LR of 0.9 [43]. Adams et al. found the sensitivity of whole-body MRI to be 45.5%, but on subgroup analysis, the sensitivity in aggressive lymphoma patients (88.9%) was significantly higher than that in indolent lymphoma patients (23.5%) (p = 0.0029) [8]. In our study, MRI demonstrated an overall pooled sensitivity of 77.8% and specificity of 88.6%, with a positive LR of 6.9, a negative LR of 0.19, and a DOR of 39.18 in HL and NHL patients combined. Further analysis showed that in HL patients, MRI demonstrated an overall pooled sensitivity and specificity of 84.6% and 80.6%, respectively, while in NHL patients, it demonstrated an overall pooled sensitivity and specificity of 77.7% and 79.5%, respectively.

An advantage of MRI is that the morphological and functional parameters obtained with MRI, particularly with DWI, do not appear to be influenced by the histology but only by the cellularity, in contrast to the avidity for FDG, which differs significantly depending on the histological grade and the aggressiveness of the lymphoma [22].

Substantial research in developing novel diagnostic tools and updating staging guidelines is underway. In the coming years, we may see the advent of even newer molecular and radiological technologies as a constitutive part of clinical trials. This will further simplify the diagnosis, staging, and prognosis of lymphomas.

One limitation of our study is that FDG-PET/CT, unlike bone marrow biopsy, can misinterpret diffuse bone marrow uptake caused by an underlying systemic disease, or by an inflammatory reaction to administered hematopoietic growth factors, as infiltration. Similarly, FDG-PET/CT can have difficulty in the initial staging of HL because it can mislabel myeloid hyperplasia, which is seen in some HL cases, as bone marrow infiltration. Another potential limitation, arising from the use of BMB as the reference test, is that a negative unilateral BMB does not always exclude BM infiltration, which can distort the counts of false positives and true negatives [22]. Finally, although heterogeneity exceeded 75% in a few analyses, we did not address the heterogeneity for each included study.

Conclusion

Based on the results of this meta-analysis, we found that non-invasive techniques, such as FDG-PET/CT and MRI, are more helpful in lymphoma staging than invasive procedures, such as bone marrow biopsy. The non-invasive tests showed high and almost equal sensitivity and specificity in detecting BMI in all types of lymphomas when compared with bone marrow biopsy. FDG-PET/CT and MRI were found to be highly accurate in detecting BMI, more so in HL than in NHL patients. This analysis suggests that these non-invasive techniques may provide essential diagnostic data, possibly resulting in more individualized treatment plans and better patient outcomes. Including FDG-PET/CT and MRI in the diagnostic workup can improve the accuracy of lymphoma staging and support clinical decision-making. Better-informed treatment decisions may in turn reduce the need for invasive procedures like bone marrow biopsy, which can be associated with patient discomfort and procedural risks. The use of non-invasive methods in routine clinical practice may therefore lead to more effective and patient-friendly diagnosis and management of lymphoma. Hence, FDG-PET/CT and MRI play an important role in accurately staging lymphomas.