Introduction

Iron deposition in the liver can be secondary to serial transfusions or to hereditary processes that result in abnormally increased red cell turnover or abnormal gastrointestinal absorption of iron [1]. Iron overload in the liver has been associated with liver injury and the development of chronic conditions such as liver fibrosis, cirrhosis, and hepatocellular carcinoma (HCC) [2]. The measurement of liver iron concentration (LIC) has traditionally relied on liver biopsy [3, 4]. However, liver biopsy is an invasive procedure with limitations that include high sampling variability, technique-related sampling errors, moderate inter-reader agreement, and low patient acceptance, and it is considered unsuitable for routine serial follow-up [3, 5,6,7,8,9,10,11]. Moreover, the rate of hospitalization related to liver biopsy and the risk of severe complications have been reported to be between 1% and 5%, with reported mortality rates ranging from 1:1000 to 1:10,000 [12,13,14,15].

Magnetic resonance imaging (MRI) has been shown to accurately quantify liver iron content without the need for invasive biopsy, thereby allowing serial assessment that can guide iron-reducing therapies [16]. MRI has largely become the diagnostic modality of choice for the identification and quantification of LIC [17].

MRI relaxation time techniques to estimate liver iron concentration have been developed and studied for several years [18,19,20,21,22,23,24,25,26,27,28]. Iron shortens the T1, T2, and T2* relaxation times measured by MRI, darkening the images [29]. The reciprocals of the relaxation times T2 and T2*, referred to here as the relaxivities R2 and R2*, are directly proportional to iron concentration and are commonly used [30,31,32].

The most widely adopted FDA-approved method is based on the measurement of tissue proton transverse relaxation rates (R2) and shows excellent correlation with LIC measured by biopsy [33, 34]. Most investigators have described a linear rise in R2 with iron [18,19,20,21,22,23,24,25,26,27,28]; however, these studies have been criticized for their small size, limited dynamic range, and inter-study calibration variability [18, 19].

R2-MRI (FerriScan®, Resonance Health, Australia) is currently FDA-approved for estimating LIC [35]. Studies have reported sensitivity and specificity of 85% and 92%, respectively, up to an LIC of 15 mg/g dry weight [17]. However, to the best of our knowledge, the results of the FerriScan® analysis have not yet been reproduced using a non-proprietary method on a large pediatric dataset. Our objective in this study was to assess the reproducibility and accuracy of R2-relaxometry MRI for estimation of LIC in children by comparing a non-proprietary in-house analysis with the results of an FDA-approved, commercially available third-party analysis.

Methods

Subjects

Our Institutional Review Board approved this Health Insurance Portability and Accountability Act (HIPAA) compliant retrospective study. All patients aged 0–18 years with available liver MR imaging indicated for iron quantification using R2 sequences between February 2008 and June 2018 were initially identified. Patients with incomplete imaging data or incomplete demographics in the electronic medical record were excluded. Demographic and anthropometric measures at the time of the exam were retrieved from the electronic medical records. Although the MR scanner software was updated in 2011 and in 2016, there were no changes in the liver iron quantification protocol.

MR imaging

All MRI studies were performed using a 1.5T scanner (Avanto, Siemens Healthineers, USA). Multi-echo spin-echo axial scans were acquired with a fixed repetition time (TR) of 1000 ms, a matrix size of 256, typical fields of view between 300 and 400 mm (exact dimensions depending on subject size), a slice thickness of 5 mm, and increasing echo time (TE) values of 6 ms, 9 ms, 12 ms, 15 ms, and 18 ms (spaced at 3 ms intervals), based on the methodology previously reported by St. Pierre et al. [25, 36]. No fat suppression was used (Table 1). All commercial MR scanners can accommodate the parameters needed to acquire the images. As required in the FerriScan® protocol guidelines, a 1-L saline bag was placed along the left side of the torso as an external reference for correction of signal gain changes. The spin-echo image acquisitions were performed free-breathing without respiratory compensation techniques. Small amounts of respiratory motion artifact from the free-breathing acquisition technique are expected and acceptable. However, if severe motion artifacts are observed on the acquired images, scans need to be repeated or performed under light sedation to improve image quality.

Table 1 Image acquisition protocol

Imaging post-processing

For analysis by an external company (FerriScan®, Resonance Health, Australia), images were securely transmitted offsite to Resonance Health for post-processing and analysis using their proprietary method, the general principles of which have been previously described [17, 25, 37]. The transverse relaxivity, R2, is calculated by exponential curve fitting to the image signal intensities obtained from the individual TE images [38]. The mean R2 value within the liver is then used to yield an estimated LIC using a calibration curve that was determined through the measurement of liver R2 and needle biopsy LIC [25]. For the images sent offsite, image analysis for LIC values estimated by the R2 method was performed by Resonance Health for a fee per study, and a report was sent back to our facility within 48–72 h. From the report sent back by Resonance Health, the mean R2, standard deviation, and corresponding estimated LIC were extracted for comparison and analysis.

For in-house analysis, post-processing was performed using parametric MRI software (www.parametricMRI.com, Philadelphia, PA, USA) (Fig. 1). We used the individual TE images to calculate R2 maps, and a mean R2 value was calculated from a region of interest (ROI) drawn on the mid-liver slice of the generated R2 map. One ROI was placed in a transverse mid-slice section, encompassing the entire visible liver boundary and avoiding any major vessels and bile ducts. Two medical doctors performed the analysis under the guidance of a PhD physicist with more than 15 years of experience in R2 measurement. Studies were randomly and equally split between the two analysts, with no study analyzed by both. The mono-exponential T2 relaxation time was calculated by fitting the signal intensity of each pixel using the equation:

$$S\left( {\text{TE}} \right) = S_{0,\text{mono}} \,\exp \left( { - \frac{\text{TE}}{{{\text{T2}}_{\text{mono}} }}} \right) + C$$

where the constant C was considered necessary to compensate for contributions from instrument noise and effects from iron-poor species such as blood and bile [34]. The calculated T2 relaxation time in ms was converted to relaxivity, R2, in s−1 by R2 = 1000/T2. R2 was then converted to estimated LIC as previously described [34]:

$${\text{LIC}} \left( {\frac{{\text{mg}}}{{\text{gm}}}}\right) = {\left( {29.75 - \sqrt {900.7 - 2.283R2} } \right)^{1.424}}$$
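
The two conversions above (T2 in ms to R2 in s−1, then R2 to estimated LIC) can be sketched in Python as follows. This is only a minimal illustration of the published formula, not the software used in this study. The function names `t2_ms_to_r2` and `r2_to_lic` are hypothetical, and the domain checks are an added assumption: the formula is real-valued only for R2 between roughly 6.8 and 394.5 s−1.

```python
import math


def t2_ms_to_r2(t2_ms):
    """Convert a T2 relaxation time in ms to R2 in 1/s (R2 = 1000 / T2)."""
    if t2_ms <= 0:
        raise ValueError("T2 must be positive")
    return 1000.0 / t2_ms


def r2_to_lic(r2_per_s):
    """Estimate LIC (mg/g dry weight) from R2 (1/s) with the calibration above."""
    radicand = 900.7 - 2.283 * r2_per_s
    if radicand < 0:
        # Assumed guard: R2 above about 394.5 1/s makes the square root undefined.
        raise ValueError("R2 is above the range of the calibration curve")
    base = 29.75 - math.sqrt(radicand)
    if base < 0:
        # Assumed guard: R2 below about 6.8 1/s gives a negative base, which has
        # no real-valued 1.424th power.
        raise ValueError("R2 is below the range of the calibration curve")
    return base ** 1.424
```

With the in-house R2 values quoted in the captions of Figs. 7 and 8 (98.5 and 280.8 s−1), this sketch gives roughly 6.6 and 41.3 mg/g, close to the reported estimates of 6.5 and 41.1 mg/g.
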
Fig. 1 Flowchart describing the image analysis steps

T2, R2, and estimated LIC maps were generated (Fig. 2). LIC reported by FerriScan® was also compared with the LIC estimated in-house using the R2 reported by FerriScan®. This comparison was performed to determine whether the difference in values was truly due to variation in ROI selection.
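
The pixel-wise mono-exponential fit with a constant offset that underlies such maps can be sketched as follows. This is not the implementation of the parametric MRI software, whose internals are not described here. It is a pure-Python illustration using separable least squares: for a trial T2 the model is linear in S0 and C, so those are solved in closed form, and T2 is found by a coarse log-spaced grid search followed by golden-section refinement. The function names, the 1–200 ms search bounds, and the grid size are assumptions made for this example.

```python
import math


def fit_mono_exp_with_offset(te_ms, signal, t2_min=1.0, t2_max=200.0, steps=120):
    """Fit S(TE) = S0 * exp(-TE / T2) + C and return (S0, T2 in ms, C)."""
    n = len(te_ms)
    mean_s = sum(signal) / n

    def solve_linear(t2):
        # For a fixed T2, S0 and C follow from ordinary linear regression
        # of the signal on e = exp(-TE / T2).
        e = [math.exp(-te / t2) for te in te_ms]
        mean_e = sum(e) / n
        var_e = sum((ei - mean_e) ** 2 for ei in e)
        cov = sum((ei - mean_e) * (si - mean_s) for ei, si in zip(e, signal))
        s0 = cov / var_e
        c = mean_s - s0 * mean_e
        rss = sum((s0 * ei + c - si) ** 2 for ei, si in zip(e, signal))
        return rss, s0, c

    # Coarse search over a log-spaced T2 grid to bracket the best fit.
    grid = [t2_min * (t2_max / t2_min) ** (i / steps) for i in range(steps + 1)]
    best = min(range(steps + 1), key=lambda i: solve_linear(grid[i])[0])
    lo, hi = grid[max(best - 1, 0)], grid[min(best + 1, steps)]

    # Golden-section refinement of T2 inside the bracket.
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(60):
        a = hi - ratio * (hi - lo)
        b = lo + ratio * (hi - lo)
        if solve_linear(a)[0] < solve_linear(b)[0]:
            hi = b
        else:
            lo = a
    t2 = 0.5 * (lo + hi)
    _, s0, c = solve_linear(t2)
    return s0, t2, c


def r2_map(echo_images, te_ms):
    """Return an R2 map (1/s); echo_images[k][row][col] is the signal at echo k."""
    rows, cols = len(echo_images[0]), len(echo_images[0][0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for col in range(cols):
            decay = [img[r][col] for img in echo_images]
            _, t2, _ = fit_mono_exp_with_offset(te_ms, decay)
            out[r][col] = 1000.0 / t2  # R2 = 1000 / T2
    return out
```

With only five echoes and three free parameters the fit is sensitive to noise, and this sketch does no background masking, so a practical pipeline would need both a mask and a dedicated fitting library.
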

Fig. 2 Representative T2 relaxation, R2 relaxivity, and estimated LIC maps

Statistical analysis

The relationship between the liver R2 and LIC values estimated in-house and the values calculated by FerriScan® was assessed with correlation coefficients and Bland–Altman difference plots. Statistical analyses were performed with MedCalc version 18.11 (MedCalc, Ostend, Belgium). Continuous variables are presented as mean ± standard deviation (SD). Significance was set at a p value < 0.05. Linear regression was used to compare the proprietary and non-proprietary values. Systematic differences between methods were tested with Bland–Altman analysis, with limits of agreement defined as the mean difference ± 1.96 multiplied by the standard deviation. Outliers were defined as studies that were more than two standard deviations from the mean and discontinuous from the sample. Correlation coefficients were classified using the following definitions: less than 0.39, weak; 0.40–0.59, moderate; 0.60–0.79, strong; 0.80–0.99, excellent [39].
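
The agreement statistics described above can be sketched in a few lines of Python. This is an illustration only, since the study used MedCalc. The function names are hypothetical, the difference is taken as in-house minus proprietary (an assumption about sign convention), and the classification cut points at 0.40, 0.60, and 0.80 are an assumed reading of the ranges listed above.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two paired samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def bland_altman(in_house, proprietary):
    """Return (bias, lower limit, upper limit) with limits at bias +/- 1.96 SD."""
    diffs = [a - b for a, b in zip(in_house, proprietary)]
    n = len(diffs)
    bias = sum(diffs) / n
    sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))  # sample SD
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def classify_correlation(r):
    """Map |r| to the qualitative labels used in the text (assumed cut points)."""
    r = abs(r)
    if r < 0.40:
        return "weak"
    if r < 0.60:
        return "moderate"
    if r < 0.80:
        return "strong"
    return "excellent"
```
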

Results

A total of 474 studies (254 studies from girls and 220 studies from boys) from 175 patients were included in the study. Twenty-four studies were excluded because of breathing motion artifacts (n = 6), incomplete scans (n = 7), and failed post-processing due to insufficient images (n = 11). The mean age of the patients was 10.4 ± 4.2 years (range 1–18 years). The LIC ranged from 0.6 to 43.0 mg/g dry liver, covering a broad range from close to normal levels to extremely high iron levels. The etiology of iron overload (number of studies) in the patient population was as follows: Beta thalassemia (n = 207), Sickle cell disease (n = 138), Sideroblastic anemia (n = 20), Pyruvate kinase deficiency (n = 17), Diamond-Blackfan anemia (n = 14), Hemoglobin E disease (n = 13), Hemoglobin H disease (n = 10), aplastic anemia (n = 8), Hereditary hemolytic anemia (n = 4), Idiopathic acquired aplastic anemia (n = 6), Non-autoimmune hemolytic anemia (n = 4), Acute lymphoblastic leukemia (n = 5), Glucose-6-phosphate dehydrogenase deficiency (n = 4), Hemochromatosis (n = 3), Alpha thalassemia (n = 2), Aplastic anemia (n = 2), Autoimmune hemolytic anemia (n = 2), Wilms tumor (n = 2), one study each with SAMD9 gene mutation, Langerhans cell histiocytosis, Hemolytic anemia of unknown origin, Ewing sarcoma, and Chronic Dyserythropoietic Anemia, and 8 studies for other oncology follow-up reasons.

Linearity between the proprietary and non-proprietary in-house methods for R2 was excellent across the observed range of R2 (31.5 to 334.8 s−1). We obtained a correlation coefficient of r = 0.87, p < 0.001 for R2 (Table 2; Fig. 3). The Bland–Altman R2 difference plot between the two methods shows a mean bias of + 21.5 s−1 (limits of agreement − 47.0 to + 90.0 s−1) (Fig. 4). Similarly, linearity between the proprietary and non-proprietary in-house methods for estimated LIC was excellent across the observed range of LIC (0.6 to 43 mg/gm). We obtained a correlation coefficient of r = 0.84, p < 0.001 for LIC (Fig. 5). The Bland–Altman LIC difference plot between the two methods shows a mean bias of + 2.9 mg/gm (limits of agreement − 8.9 to + 14.7 mg/gm) (Fig. 6). Representative comparisons between the two methods are shown for two subjects, one with a history of Beta thalassemia (Fig. 7) and the other with Diamond-Blackfan anemia (Fig. 8). Excellent agreement was observed between the LIC reported by FerriScan® and the LIC estimated in-house using the R2 reported by FerriScan®, r = 1.0, p < 0.001 (Fig. 9).
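
As a rough consistency check, and assuming the intervals quoted in parentheses above are the mean ± 1.96 SD limits of agreement defined under Statistical analysis, the standard deviation of the paired differences implied by the reported values can be back-calculated. These SD values are derived here for illustration only and are not reported results:

$$\text{SD}_{R2} \approx \frac{90.0 - 21.5}{1.96} \approx 34.9\ \text{s}^{-1}, \qquad \text{SD}_{\text{LIC}} \approx \frac{14.7 - 2.9}{1.96} \approx 6.0\ \text{mg/gm}$$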

Table 2 R2 and LIC comparison between methods
Fig. 3 Scatterplot of R2 (s−1) showing a strong linearity of non-proprietary in-house analysis against proprietary methods

Fig. 4 Bland–Altman analysis plot of R2 (s−1) between the two methods

Fig. 5 Scatterplot of estimated LIC (mg/gm) showing a strong linearity between the non-proprietary in-house analysis and proprietary method

Fig. 6 Bland–Altman analysis plot of estimated LIC (mg/gm) between the two methods

Fig. 7 Representative images from a 4-year-old boy with a history of Beta thalassemia. The FerriScan® report shows a mean R2 of 91.5 ± 14.2 s−1, corresponding to an estimated LIC of 5.4 mg/g. Performing the analysis in-house, we measured a mean T2 of 8.9 ms, translating to an R2 of 98.5 ± 5.2 s−1 and an estimated LIC of 6.5 mg/g. a Signal intensity versus echo time, with varying degrees of liver signal. b Proprietary R2 map versus the R2 map generated in-house by non-proprietary analysis. c Corresponding T2 map and d LIC map generated in-house by non-proprietary analysis

Fig. 8 Representative images from a 5-year-old boy with a history of Diamond-Blackfan anemia. The FerriScan® report shows a mean R2 of 241 ± 27.9 s−1, corresponding to an estimated LIC of 30.1 mg/g. Performing the analysis in-house, we measured a mean T2 of 4.7 ms, translating to an R2 of 280.8 ± 30.4 s−1 and an estimated LIC of 41.1 mg/g. a Signal intensity versus echo time, with varying degrees of liver signal. b Proprietary R2 map versus the R2 map generated in-house by non-proprietary analysis. c Corresponding T2 map and d LIC map generated in-house by non-proprietary analysis

Fig. 9 Scatterplot of LIC (mg/g) showing a strong linearity of non-proprietary against proprietary values. LIC reported by FerriScan® agreed strongly with LIC estimated in-house using the R2 reported by FerriScan® (r = 1.0, p < 0.001)

Discussion

Non-invasive assessment of LIC is widely accepted as an alternative to biopsy and serves as a reliable technique for clinicians to guide evidence-based management of iron overload. However, R2-based analysis is currently available only through a commercial offshore data processing center using a proprietary method (FerriScan®, Resonance Health, Australia), limiting its clinical and research applications. We investigated the inter-method reproducibility between proprietary and in-house curve fitting methods. We found that R2 relaxometry MRI values for LIC estimation are reproducible between the FDA-approved commercially available method and our in-house analysis method, with excellent agreement. To the best of our knowledge, no previous studies have reported a comparison between two different methods to determine LIC in a pediatric population with a wide range of iron content and a large number of studies.

Our results for R2 values show strong agreement between the proprietary and the in-house developed non-proprietary methods (r = 0.86, p < 0.001). The Bland–Altman difference plot between the two methods shows an R2 mean bias of + 21.5 s−1 or an LIC mean bias of + 3.0 mg/gm. In our opinion, this is unlikely to be clinically meaningful. Our results are within the range of a previously reported inter-method agreement of ± 9% obtained in a smaller sample of 40 studies in adults [40]. Our results also suggest that the difference in the LIC values estimated by the two methods is likely due only to user-dependent ROIs and not to the conversion from R2 to LIC. Using the R2 values reported by FerriScan® and estimating LIC in-house with the previously reported formula [30, 34], we observed excellent agreement in estimated LIC (r = 1.0, p < 0.001), with the Bland–Altman difference plot for this comparison showing a mean bias of only − 0.37 mg/g (range − 0.74 to + 0.01 mg/g). The purpose of our study was to compare MRI-derived R2 and the corresponding estimated LIC between two methods using formulae published in the scientific literature. We did not intend or attempt to reverse engineer proprietary analysis methods. The availability of in-house analysis can facilitate further development and research of acquisition and analysis methods by independent academic investigators. Potential research areas include fat suppression, motion suppression techniques such as non-Cartesian k-space acquisitions, and acquisition speed improvements such as echo train acquisitions and compressed sensing. Advanced analysis methods may include the development of machine learning and deep learning algorithms.

One limitation of this study is that all available R2 MR images for LIC assessment were acquired at a single institution. Future work should correlate our results with multi-center studies. A second limitation is that, although the images are acquired free-breathing and small amounts of motion artifact are expected, scans with severe motion are repeated or performed under light sedation. Within the scope of this work, and because of the retrospective nature of the study, we did not track the need to repeat scans or the sedation records. Finally, the LIC range of the population included in our study extended from 0.6 mg/g to an upper estimated LIC of 43 mg/g; this upper limit is due to the inherent limitation of the current MRI protocol used for the R2 mapping method (Fig. 10). If needed, an ultrashort echo time (UTE) based acquisition protocol may be used to increase the dynamic range.

Fig. 10 Representative estimated LIC maps from 4 separate patients, ranging from normal to extremely high iron levels, superimposed on the 1st echo anatomical image

Conclusion

In conclusion, R2 based relaxometry MR imaging for liver iron concentration estimation is a robust and reliable method that is reproducible between FDA-approved commercially available and our in-house analysis methods.