Introduction

Background

Magnetic Resonance Imaging (MRI) is an invaluable diagnostic tool that complements ultrasound in obstetric settings by providing non-invasive, high-contrast, multi-planar, three-dimensional imaging of fetal anatomy [1, 2]. The primary disadvantage of fetal MRI is that its image quality is easily degraded by Motion Artifacts (MAs), which result primarily from spontaneous gross fetal movement during image acquisition and, to a lesser extent, from maternal motion, including breathing and bowel motion [1, 3,4,5,6,7,8]. While shortening the acquisition time can partially address this challenge, it also restricts the spatial resolution and image contrast, lowering image quality [8]. An alternative approach has been to develop MR pulse sequences that account for fetal motion and to remove motion effects during image post-processing (for example, [9,10,11]).

Development and testing of these pulse sequences necessitate the use of motion phantoms [12, 13]. However, these phantoms commonly have a very rudimentary design that does not simulate the complex and dynamic anatomy of the gravid uterus [14], which is often accompanied by unique tissue-related artifacts [5]. While the use of pregnant animals could address this need, repeated and extended sequence testing on animals is time-consuming, expensive, and necessarily inaccurate because the required sedation inhibits fetal motion and is not representative of the human scenario [12]. A single MRI phantom that is anthropomorphic and also simulates fetal motion would permit faster and more accurate testing with reproducible images [12, 15]. Thus, a fetal Anthropomorphic MR Motion Phantom (AMP) could shorten the sequence development time, accelerate the clinical translation of new sequences, and enhance the diagnostic quality of fetal MR imaging.

While maternal and fetal motion can be controlled to some degree with breath holds or sedation, even minor spontaneous gross fetal movements as well as fetal cardiovascular and respiratory motion can significantly degrade MR image quality [1, 16,17,18,19]. Fetal cardiovascular movement begins at the end of 4 weeks of gestation and increases from 110 Beats Per Minute (BPM) to 170 BPM at 9 weeks [20]. The heart rate then gradually decreases until term, when the mean heart rate is 135 BPM [21]. Fetal breathing motion is a complex phenomenon consisting of rhythmic movement of the fetal chest and abdominal wall [22]. The mean Gestational Age (GA) at the onset of abdominal movement is 18 ± 2 weeks, while that at the onset of chest movement is 21 ± 3 weeks [23]. The average fetal respiratory rate is 44 breaths/min at 24–28 weeks and 43 breaths/min at 28–39 weeks [22, 24]. The onset of gross fetal motion, characterized by trunk movement, occurs at 7.5–8.5 weeks and continues until delivery [25, 26]. The mean number of gross fetal body movements at 22–34 weeks is 60 movements/h, which then gradually decreases to an average of 40 movements/h from 35 weeks to term [27, 28]. The gradual decrease in movement frequency as the fetus grows is generally attributed to a decrease in intrauterine space as well as the establishment of behaviour states after 30 weeks [29].

The purpose of this study was to identify and quantitatively analyze all fetal MRI phantoms that have been published to date. A narrative review to identify published phantoms was followed by a quantitative analysis of each phantom’s accuracy based on chosen evaluation criteria. Then, a secondary narrative review was conducted to draw conclusions from the data and provide recommendations on how to construct an ideal fetal AMP. We aimed to provide a valuable reference for researchers who plan to build or use a fetal MRI phantom and to demonstrate the importance of integrating AMPs in fetal MR studies.

Methods

Data source and search strategy for phantom identification

A literature search was conducted using PubMed, Google Scholar, and Ryerson University Library (RUL) databases to identify all fetal MRI phantoms published prior to April 6th, 2019. In an initial search on PubMed and RUL databases, the search term ‘fetal MRI phantom’ was used. This was followed by a secondary search using the following keywords: (‘fetus’ or ‘fetal’) AND (‘MRI’ or ‘Magnetic Resonance Imaging’) AND (‘phantom’ or ‘anthropomorphic phantom’ or ‘model’ or ‘in vitro model’ or ‘physical model’) AND (‘pregnant woman’ or ‘gravid abdomen’ or ‘pregnant abdomen’ or ‘heart’ or ‘lungs’ or ‘brain’ or ‘liver’ or ‘placenta’ or ‘placental’ or ‘amniotic fluid’ or ‘artificial amniotic fluid’ or ‘gravid uterus’) NOT (‘mouse’ or ‘sheep’ or ‘rat’ or ‘rabbit’ or ‘primate’ or ‘porcine’). For every search conducted on the RUL database, the option to ‘add results beyond library collection’ was used to widen the search. Both primary and secondary searches examined only articles with these keywords in the abstract to ensure that only relevant articles were reviewed.
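The Boolean structure of the secondary search can be made explicit with a short script. The following is a minimal sketch, assuming a generic AND/OR/NOT syntax with double-quoted phrases. The helper names (`or_group`, `build_query`) are hypothetical, and the operator syntax actually accepted by PubMed or a library database may differ.

```python
def or_group(terms):
    """Join terms into one parenthesised OR group, quoting each phrase."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(include_groups, exclude_terms=()):
    """AND together the OR groups, then append an optional NOT group."""
    query = " AND ".join(or_group(group) for group in include_groups)
    if exclude_terms:
        query += " NOT " + or_group(exclude_terms)
    return query


# Keyword groups copied from the secondary search described above.
SUBJECT = ["fetus", "fetal"]
MODALITY = ["MRI", "Magnetic Resonance Imaging"]
MODEL = ["phantom", "anthropomorphic phantom", "model",
         "in vitro model", "physical model"]
ANATOMY = ["pregnant woman", "gravid abdomen", "pregnant abdomen", "heart",
           "lungs", "brain", "liver", "placenta", "placental",
           "amniotic fluid", "artificial amniotic fluid", "gravid uterus"]
ANIMALS = ["mouse", "sheep", "rat", "rabbit", "primate", "porcine"]

secondary_query = build_query([SUBJECT, MODALITY, MODEL, ANATOMY], ANIMALS)
print(secondary_query)
```

Keeping each keyword group in its own list makes it easier to rerun or audit the search when a group is extended.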

The keywords searched in Google Scholar had to be modified because this search engine displays articles in a research area similar to that of the keywords used, regardless of whether the keywords appear in the full text. The searches that were conducted on Google Scholar and the respective number of article titles that were reviewed were as follows: ‘fetal MRI phantom’ (first 50 articles), ‘anthropomorphic fetal MRI phantom’ (first 50 articles), and ‘MRI phantom of fetus’ (first 100 articles). The results were further limited to articles published between 2005 and 2019, primarily because fetal MRI phantom development was considered unlikely prior to 2005.

Inclusion and exclusion criteria

Articles were included in the narrative review only if they reported on the use or development of a fetal MRI phantom. A fetal MRI phantom was defined as an artificial physical model that simulates the human gravid uterus, fetal organs, or maternal body parts and that has undergone MR imaging. The following were excluded from the analysis: non-English articles, studies of other modalities that were not performed in conjunction with MRI, non-(fetal or phantom) studies, animal studies, and studies with a non-fetal (neonatal, infant, child, or non-pregnant adult) phantom or a non-human (animal, digital, voxel, numerical, mathematical or computational) phantom.

Screening

Potentially relevant articles were first identified based on whether their titles included the following keywords: (‘fetal’ OR ‘phantom’) AND (‘MRI’). Articles with titles that made reference to any of the following exclusion criteria were excluded from abstract screening: non-English study, non-MRI study, a non-(fetal or phantom) study, an animal study or a study that used a non-fetal or a non-human phantom. The abstracts of the identified articles were then screened manually, and those not involving imaging phantom development or testing, or safety validation, accuracy, or motion-correction testing within a fetal MR study were excluded. The available full texts of the remaining articles were then reviewed, and only those satisfying all inclusion/exclusion criteria were included. Specifically, for the keyword title screening, alternative terms were identified prior to conducting the search, as shown in Table 1, and counted as equivalent terms. Duplicated search results were not removed.

Table 1 Alternate terminology in keyword title screening

Methodology review

To validate the accuracy and ensure the robustness of the described methodology, three equally trained research assistants conjointly followed the search strategy and repeated the literature search and screening.

Evaluation criteria

In our analysis, and based on the available literature, we considered a fetal AMP for a given GA to be accurate if it:

  1. Mimics the anatomy of the simulated body parts, providing a more realistic phantom for testing of MR pulse sequences [12].

  2. Mimics the dielectric conductivity of the simulated tissues, so that it reproduces dielectric and chemical shift artifacts [30], correctly simulates the Specific Absorption Rate (SAR) deposition of an imaging sequence, and thus, accurately models MR-induced heating effects [31].

  3. Mimics the proton relaxation times of the simulated tissues, so that it correctly reproduces the contrast of a fetal MR image, thus guiding the selection of optimized parameters for a pulse sequence [32].

  4. Simulates spontaneous gross fetal movement, and cardiovascular and respiratory motion, so that it mimics physiological MAs, allowing GA-specific motion-correction algorithms to be created and tested [12, 13].

Quantitative phantom analysis

The accuracy of the phantoms identified from the included studies was determined based on four equally weighted evaluation categories and eight subcategories, with each phantom being assigned up to a maximum of 12 merit points as follows:

Category 1: anatomical accuracy (3 points total)

Each phantom was awarded 1.5 points for anatomically accurate shape and 1.5 points for anatomically accurate size for the reported GA. Anatomical shape counted as accurate if the phantom was constructed directly from imaging scans or by exactly mimicking the anatomical profile of the simulated tissues using literature values. Anatomical size counted as accurate if the phantom was created based on anatomically accurate dimensions. If the shape was simplified or the dimensions scaled, 0.75 points were awarded. If the shape design or the dimensions were not discussed, no points were awarded.

Category 2: dielectric conductivity (3 points total)

Each phantom was awarded 3 points for mimicking the conductivity of all individual tissues being simulated for the specific GA. If only the average conductivity of the adult human body was simulated, 1.5 points were awarded. If only some but not all tissues were simulated, 1.5 points were awarded.

Category 3: relaxation times (3 points total)

Each phantom was awarded 3 points for mimicking the relaxation times of all individual tissues being simulated for the specific GA. If only some but not all tissues were simulated, then 1.5 points were awarded.
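The point rules for Categories 1 to 3 amount to simple lookups. The sketch below is illustrative only: the function names and string labels are assumptions introduced here, and the zero-point case for a phantom that simulates no tissue properties is implied by the rubric rather than stated.

```python
# Points per anatomical subcategory (shape or size), as described in Category 1.
ANATOMY_POINTS = {
    "accurate": 1.5,       # built from imaging scans or literature dimensions
    "simplified": 0.75,    # simplified shape or scaled dimensions
    "not_discussed": 0.0,  # design or dimensions not reported
}

# Points for Categories 2 and 3.
TISSUE_POINTS = {
    "all_tissues": 3.0,
    "some_tissues": 1.5,
    "average_adult": 1.5,  # conductivity only: average adult body value
    "none": 0.0,           # assumption: not stated explicitly in the rubric
}


def anatomy_score(shape, size):
    """Category 1: sum of the shape and size subcategory points (max 3)."""
    return ANATOMY_POINTS[shape] + ANATOMY_POINTS[size]


def conductivity_score(level):
    """Category 2: dielectric conductivity points (max 3)."""
    return TISSUE_POINTS[level]


def relaxation_score(level):
    """Category 3: relaxation time points (max 3)."""
    if level == "average_adult":
        raise ValueError("'average_adult' is defined for conductivity only")
    return TISSUE_POINTS[level]


# Hypothetical phantom: accurate shape, scaled size, some tissue values matched.
print(anatomy_score("accurate", "simplified"),
      conductivity_score("some_tissues"),
      relaxation_score("some_tissues"))
```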

Category 4: physiological motion (3 points total)

Phantoms were awarded 1 point each for mimicking fetal gross body movement, cardiovascular function, and respiratory motion with 100% accuracy. For fetal cardiovascular motion, simulating cardiac contraction and cardiovascular flow were awarded 0.5 points each for 100% accuracy. Motion accuracy was taken as 100% if the frequency of the simulated motion was within the range of the expected fetal trunk, cardiovascular, or respiratory movement rates for the respective GA. If the simulated motion frequency was outside the expected range, the simulated frequency was expressed as a percentage of the expected value and the awarded merit points were scaled accordingly. For example, the following calculations show how points were awarded for phantoms with 100% (1) and sub-100% (2) cardiovascular motion accuracies:

  1. 36 week cardiac phantom by Kording et al. [7].

    (a) Simulated motion frequency: 75–200 BPM for both cardiac contraction and cardiovascular flow.

    (b) Expected motion frequency at 36 weeks: 137 BPM [21].

    (c) Conclusion: 137 BPM is within the simulated range. Therefore, cardiac contraction and cardiovascular flow accuracy are both 100%.

    (d) Points awarded for simulating cardiovascular function: \(0.5 \times 100\% + 0.5 \times 100\% = 1\;\text{point}\).

  2. 37 week blood vessel phantom by Jansz et al. [6].

    (a) Simulated motion frequency: no cardiac contraction with a maximum achievable cardiovascular flow frequency of 80 BPM.

    (b) Expected motion frequency at 37 weeks: 136 BPM [21].

    (c) Conclusion: 136 BPM is outside of the simulated range. Therefore, cardiovascular flow accuracy is: \(\left(80\;\text{BPM}/136\;\text{BPM}\right) \times 100\% = 58.82\%\).

    (d) Points awarded for simulating cardiovascular function: \(0.5 \times 0\% + 0.5 \times 58.82\% = 0.29\;\text{points}\).
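The two worked examples above can be reproduced with a few lines of code. This is a minimal sketch, not the authors' implementation: the function names are hypothetical, the lower bound of 0 BPM for the Jansz et al. phantom is an assumption (only the 80 BPM maximum is reported), and the handling of a phantom that can only run faster than the expected rate is likewise an assumption, since no such case is described.

```python
def motion_accuracy(expected_rate, simulated_range):
    """Return motion accuracy as a fraction between 0 and 1.

    simulated_range is (min_rate, max_rate), or None if the motion
    is not simulated at all.
    """
    if simulated_range is None:
        return 0.0
    low, high = simulated_range
    if low <= expected_rate <= high:
        return 1.0
    if expected_rate > high:
        # As in the second worked example: achievable rate over expected rate.
        return high / expected_rate
    # Assumption: a phantom that is too fast is scaled symmetrically.
    return expected_rate / low


def cardiovascular_points(expected_bpm, contraction_range, flow_range):
    """0.5 points each for cardiac contraction and cardiovascular flow."""
    return (0.5 * motion_accuracy(expected_bpm, contraction_range)
            + 0.5 * motion_accuracy(expected_bpm, flow_range))


# Example 1: 36 week phantom, 75-200 BPM for contraction and flow, 137 BPM expected.
kording = cardiovascular_points(137, (75, 200), (75, 200))

# Example 2: 37 week phantom, no contraction, flow up to 80 BPM, 136 BPM expected.
jansz = cardiovascular_points(136, None, (0, 80))

print(round(kording, 2), round(jansz, 2))  # prints: 1.0 0.29
```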

To the best of our knowledge, there have not been any comprehensive relaxometry and conductivity studies for fetal tissues for specific GAs. Thus, the accuracy of the reported relaxation times and conductivity values could not be verified and full points were awarded if these values were estimated based on theoretical calculations. These categories and merit points are summarized in Table 2.

Table 2 Phantom evaluation criteria point system

Statistical analysis

Raw merit scores could not be used to provide a reliable review of the fetal MRI phantom field due to the nonhomogeneous point system. As a result, percentage accuracy was introduced to provide an unbiased parameter of the accuracy of each phantom. Percentage accuracy was calculated as the normalized points per evaluation category and reported as percentage values. This also allowed for a quantitative comparison between 3D printing versus conventional phantom synthesis methods, such as mold casting, use of household materials, and methods requiring only manual assembly. Using percentage accuracy also facilitated an unbiased comparison between the evaluation categories and subcategories to determine the status of the fetal MRI phantom field. This was complemented by a GA analysis to determine the trimester applicability for each phantom. Applicability was determined by the trimester which encompassed the majority of the reported GA range. However, if the reported GA range covered 50% or more of multiple trimesters, the phantom was recorded as applicable for each eligible trimester. The weekly ranges used were 0–13 weeks for the first trimester, 14–27 weeks for the second, and 28–40 weeks for the third, as per the guidelines provided by the American College of Obstetricians and Gynecologists [33]. Two-way unbalanced ANOVA followed by Tukey Honest Significant Difference tests were conducted for all analyses using Microsoft Excel 2016 (Microsoft, Seattle, USA) with an XLSTAT add-in (Addinsoft, Paris, France). P < 0.05 was considered statistically significant. To account for the non-normal data distributions within the synthesis method and evaluation category analyses, bootstrapping with 1999 resamples was conducted after the hypothesis testing [34]. The standard deviation of the resampled means was then calculated to approximate the standard error of the mean (SEM) for each variable. 
The graphs and statistical results for these analyses are reported as bootstrapping mean ± SEM. While the reported evaluation category bootstrapping means were used to report the overall percentage accuracy per category, the sub-category results were reported as non-bootstrapping percent contributions as an approximation. Percent contribution was calculated as the normalized points per sub-category.
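The normalization and bootstrapping steps can be illustrated with the Python standard library. The sketch below is a generic stand-in rather than the XLSTAT procedure used in the study: the function names, the fixed random seed, and the example scores are all hypothetical, and only the number of resamples (1999) and the use of the standard deviation of the resampled means as the SEM are taken from the text.

```python
import random
import statistics


def percentage_accuracy(points, max_points=3.0):
    """Normalize merit points for one category to a percentage."""
    return 100.0 * points / max_points


def bootstrap_mean_sem(values, n_resamples=1999, seed=0):
    """Resample with replacement and return (bootstrap mean, SEM estimate).

    The SEM is approximated by the standard deviation of the resampled means.
    """
    rng = random.Random(seed)
    n = len(values)
    resampled_means = [
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    ]
    return statistics.fmean(resampled_means), statistics.stdev(resampled_means)


# Hypothetical category scores (out of 3 points), for illustration only.
example_points = [3.0, 1.5, 0.0, 2.25, 0.75, 3.0, 0.0]
example_pct = [percentage_accuracy(p) for p in example_points]

mean, sem = bootstrap_mean_sem(example_pct)
print(f"{mean:.1f} ± {sem:.1f} %")
```

Fixing the seed makes the resampling reproducible. A different seed would give slightly different estimates.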

Results

Literature search results and methodology review

A total of 577 articles were identified according to the search criteria. A keyword search of their titles led to the removal of 331 articles. The abstracts of the remaining 246 articles were reviewed, of which 162 were excluded during abstract screening. Following a full-text review of the remaining 84 articles, 17 articles remained and were included in the narrative review. A flowchart of these search results is shown in Fig. 1, and the characteristics of the included studies and their corresponding fetal MRI phantoms are summarized in Table 3. A further summary of the identified phantoms is provided in Table 4 in the Appendix.

Fig. 1 Methodology and results of the narrative review

Table 3 Characteristics of the fetal MRI phantoms identified in the narrative review
Table 4 Summary of the identified phantoms

The repeated literature search confirmed the accuracy of the methodology; the same number of articles were identified and excluded in each screening step.

Quantitative phantom analysis

The results of the quantitative phantom analysis are shown in Table 5 and the percentage accuracy of each phantom per publication year is shown in Fig. 2. The statistical analyses for the synthesis method and evaluation category comparisons are shown in Figs. 3 and 4, respectively. The average overall accuracy across phantoms was 26 ± 5%. The anatomical evaluation category had the highest percentage accuracy (56 ± 11%), while physiological motion was the least accurate (7 ± 3%) (p = 0.0021). Simulating anatomically correct shape and size contributed equally to the average anatomical accuracy (Fig. 4). The only type of motion simulated in the identified phantoms was cardiovascular function, with 60% of the accuracy arising from simulation of cardiovascular flow (Fig. 4). None of the phantoms simulated gross body or respiratory motion. This demonstrates a significant gap in the literature, since the identified phantoms were more likely to be anatomically correct than physiologically accurate. We also determined that 3D-printed phantoms had a higher average overall percentage accuracy (58 ± 15%) than traditionally constructed phantoms (19 ± 4%) (p = 0.001). While the power to draw a conclusion is limited because only three 3D-printed phantoms were identified, this indicates that 3D printing tends to result in more accurate phantoms. This limitation is nonetheless expected, since 3D printing is a relatively new technique that was first used for phantom manufacturing only in 2015. We also noted that the identified fetal MR phantoms represented either the third trimester (n = 10) or the second trimester (n = 8), while none represented the first trimester.
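The trimester counts above sum to 18 although 17 phantoms were identified, which is consistent with the rule in the Methods that a phantom may be recorded as applicable to more than one trimester. The sketch below shows one possible reading of that rule. It assumes GA ranges are given in whole weeks and counted inclusively (the source does not state how partial weeks were handled), and the function names are hypothetical.

```python
# Weekly ranges per trimester, as given in the Methods.
TRIMESTERS = {"first": (0, 13), "second": (14, 27), "third": (28, 40)}


def overlap_weeks(ga_range, trimester_range):
    """Number of whole weeks shared by two inclusive week ranges."""
    low = max(ga_range[0], trimester_range[0])
    high = min(ga_range[1], trimester_range[1])
    return max(0, high - low + 1)


def applicable_trimesters(ga_start, ga_end):
    """Return the trimester(s) a phantom is recorded as applicable to."""
    overlaps = {
        name: overlap_weeks((ga_start, ga_end), weeks)
        for name, weeks in TRIMESTERS.items()
    }
    # Trimesters of which the GA range covers at least 50%.
    covered = [
        name for name, (low, high) in TRIMESTERS.items()
        if overlaps[name] / (high - low + 1) >= 0.5
    ]
    if len(covered) >= 2:
        return covered
    # Otherwise, the trimester containing the majority of the GA range.
    return [max(overlaps, key=overlaps.get)]


# Hypothetical GA ranges, for illustration only.
print(applicable_trimesters(20, 36))  # prints: ['second', 'third']
print(applicable_trimesters(36, 36))  # prints: ['third']
```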

Table 5 Merit points awarded for each identified phantom based on chosen evaluation criteria
Fig. 2 Percentage accuracy per publication year of fetal phantoms

Fig. 3 Percentage accuracy comparison of 3D printing and traditional phantom synthesis methods per evaluation category (**p < 0.01)

Fig. 4 Percentage accuracy per evaluation category, and percent contribution of each sub-category to the total percentage accuracy of each category (**p < 0.01)

Fig. 5 Various 3D-printed compartments of the anthropomorphic MR phantom of the gravid uterus developed by García-Polo et al. [8]

Fig. 6 A to-scale fetal MRI phantom constructed using saline bags, developed by Victoria et al. [35]. Reprinted by permission from Springer Nature: Springer-Verlag. Pediatr Radiol. Fetal magnetic resonance imaging: jumping from 1.5 to 3 Tesla (preliminary experience). Victoria T, Jaramillo D, Roberts TPL, et al. ©2014

Fig. 7 Array coil posed on the anthropomorphic to-scale gravid uterus phantom used by Spatz [36]

Fig. 8 4–7 mL lung models developed by Büsing et al. [40] imaged with half-Fourier acquired single-shot turbo SE. Reproduced with permission from Büsing KA, Kilian AK, Schaible T, et al. Reliability and Validity of MR Image Lung Volume Measurement in Fetuses with Congenital Diaphragmatic Hernia and in Vitro Lung Models. Radiology 2008;246:553–561. Radiological Society of North America

Fig. 9 Hand-fashioned model mimicking the fetal lung developed by Kehl et al. [41]

Fig. 10 The dimensions and setup of the fetal MRI biventricular heart phantom developed by Kording et al. [7]

Fig. 11 Fetal cardiac phantom compensated with maternal respiratory motion developed by Antoni et al. [16]: the photograph (a) and schematic drawing (b) show a UR5 robot setup (A), heart phantom (B), CTG (C), US damping plate (D), tank (E) and linear joint (F) used to move the heart phantom. Reprinted by permission from Springer Nature: Springer Nature. Int J Comput Assist Radiol Surg. Model checking for trigger loss detection during Doppler ultrasound-guided fetal cardiovascular MRI. Antoni S-T, Lehmann S, Neidhardt M, et al. ©2018

Discussion

Overview of the fetal MRI phantom field

Based on our chosen evaluation criteria, the average percentage accuracy for the fetal phantoms identified in our literature search was only 26%, with the best phantom attaining 75% accuracy. While the first published phantom was constructed in 1985, the majority of the identified phantoms were constructed between 2017 and 2019. This suggests that the use of fetal MR phantoms is increasing in popularity, reinforcing the need for a quantitative analysis of their accuracy to support and facilitate future development in the field. Our analysis also shows that the identified fetal phantoms represent the second or third trimester, with none representing the first trimester.

The evaluation categories chosen to measure the accuracy of each phantom were equally weighted since the importance of each category is highly dependent on the phantom application. For example, anatomical accuracy is very important for the development of complex gravid uterus phantoms that are used as a proxy for human subjects to develop and test new fetal MRI pulse sequences [8, 36]. Simulating proton relaxation times is especially important when testing diagnostic sequences which monitor differences in relaxation times in diseased tissues, such as in the development of R2* mapping to diagnose ischemic placental disease [45]. Mimicking tissue conductivity is essential when developing phantoms to evaluate MRI-related heating effects [31], such as when assessing the safety of an occlusive balloon in MR-assisted fetoscopic tracheal occlusion [42]. Moreover, simulating accurate physiological motion is mandatory for the development of MRI triggering techniques for fetal cardiovascular MRI [16] or to validate metric optimized gating phase-contrast MR flow measurements [6, 44]. Given that all categories were equally weighted, the results suggest that the currently available fetal phantoms are significantly more likely to be anatomically correct than physiologically accurate. Namely, 8 out of the 17 identified phantoms simulated accurate shape and size for GA, while another 3 phantoms used a simplified and/or scaled model of the anatomy. Smaller phantoms require shorter image acquisition times than full-scale phantoms due to a smaller field of view and slice thickness; however, to achieve proportional image quality in small phantoms relative to full-scale phantoms, higher image resolution would be required [46]. Scaling or simplifying the anatomy also limits which anatomical features can be distinguished in a phantom, making abnormality detection less accurate [47].
The importance of preserving both anatomically correct shape and size was evident as both subcategories equally contributed to the overall anatomical accuracy of each phantom. Thus, phantoms should be anatomically correct in terms of both size and shape to replicate the technical challenges of true human fetal MRI and permit more accurate validation and testing of new fetal MRI sequences [12]. The anatomical category also had the highest percentage accuracy (56%) among all evaluation categories, suggesting that anatomical accuracy is currently the criterion most highly regarded by fetal MR phantom developers.

The evaluation categories with the next highest percentage accuracy were relaxation times at 24%, followed by dielectric conductivity at 18%. While anatomical accuracy is usually the first consideration when constructing a phantom, correctly simulating the relaxation times of tissues is just as crucial for accurately testing a new MR pulse sequence. A difference in relaxation times can markedly affect the contrast of the image and the parameters needed for an optimized imaging protocol [18]. Refining a sequence using incorrect tissue relaxation times may result in poor image contrast when using the sequence on a human subject. Thus, having a phantom that mimics the tissue relaxation times can reduce the amount of time needed to test and refine an imaging sequence once it is translated into clinical applications. Similarly, dielectric conductivity must also be considered so that the phantom simulates the effects of the fetal tissue composition under MRI. Several image artifacts, such as central brightening [32], dielectric artifacts, and eddy current artifacts [30] depend on the conductivity of the material. Hence, simulating conductivity-related artifacts and the resulting image quality requires accurate simulation of material conductivity in the phantom. Moreover, accurate simulation of tissue conductivity permits evaluation of heating artifacts and calculations of the resulting SAR [31]. To the best of our knowledge, relaxometry of fetal or maternal tissues has only been quantified for the fetal brain [48], umbilical cord blood [43] and placenta [45], while conductivity has not been documented. As such, these values for other tissues can only be determined theoretically, as was done by García-Polo et al. [8] and Spatz [36]. Due to the lack of literature on this topic, the accuracy of the tissue relaxation times and conductivity simulated in the identified phantoms could not be evaluated.
Since these values change throughout pregnancy due to the rapid growth, development and maturation of fetal tissues, a study that measures these values throughout pregnancy, and particularly between the age of viability (mid-second trimester) and term, would be valuable for facilitating accurate synthesis and design of fetal phantoms for specific GAs.

Very limited research has been conducted on replicating physiological motion in fetal MRI phantoms. Out of the 17 phantoms identified, only 7 were motion phantoms, all of which simulated only cardiovascular function. No phantom simulated spontaneous gross fetal body movement or fetal respiratory motion. Further in-depth research is needed to enable simultaneous accurate simulation of the different types of physiological motion in a fetal AMP. Such simulation would allow physiologically accurate MAs to be generated for testing motion-correction approaches [13, 49] and the resulting SNR and contrast [49, 50]. Additionally, the refocusing RF pulses in Carr–Purcell–Meiboom–Gill (CPMG)-type sequences can cause tissue heating, which increases the need to monitor tissue temperature [1]. Accurately simulating motion would permit accurate MR gradient selection, which is of utmost importance to ensure fetal safety [1].

3D printing and traditional phantom manufacturing approaches

Our quantitative analyses also determined that phantoms constructed using 3D printing had a significantly higher overall percentage accuracy than phantoms constructed using other traditional methods, which indicates that 3D printing technology facilitates creating more accurate fetal MRI phantoms. While this conclusion is limited because of the small number of 3D-printed phantoms published to date, this new technique presents a promising development in the fetal MRI phantom field. Traditional phantoms, such as ones made through mold casting and those consisting of household materials and requiring only manual assembly, have several limitations. Most prominently, these phantoms cannot efficiently replicate the complex geometry of the human body [13, 51]. Specifically, the traditional techniques which were observed in the narrative review lacked the customizability necessary to replicate complex fetal anatomy. This was observed with 8 of the 14 traditional phantoms, which could not fully replicate the size and shape of the fetal and maternal anatomy [15, 16, 35, 37,38,39, 42, 45]. For example, Chen et al. [37] used a body-sized phantom, filled with a tissue-mimicking solution, to test a pregnant abdomen coil array. Although the phantom incorporated the approximate dimensions and even the rounded structure of the human torso, its estimated dimensions failed to simulate the topography of the pregnant abdomen. Another limitation is that the phantom was unable to simulate the internal anatomy of the gravid uterus due to its single-chamber structure. The remaining 6 traditionally constructed phantoms that were anatomically correct also had several disadvantages [6, 18, 40, 41, 43, 44]. For example, the lung phantoms produced by Kehl et al. [41] and Büsing et al. [40] were constructed from ‘lung-shaped’ sheaths filled with variable amounts of gelatin. 
Although the phantom construction procedure was simple and easy to follow, the manual construction method is susceptible to anatomical variability and to subjectivity on the part of the person constructing the phantoms. 3D printing technology, which allows for direct printing of highly precise and complex structures from 3D fetal images, overcomes these issues, permitting more versatile, patient-specific, and relatively inexpensive prototyping of MRI phantoms [13, 51].

3D printing enables direct printing of the phantom or indirect printing of a mold, which is then filled with material which, when cured, is taken out of the mold [13, 52]. Direct 3D printing could also include printing a solid phantom or a hollow one which can then be filled with tissue-mimicking materials. Filippou [13] conducted a systematic review of 50 articles that developed 3D-printed phantoms for CT, MRI, PET, SPECT, ultrasonography and/or mammography modalities. They determined that most of the 3D-printed phantoms had accurate dimensions [13]. Although 3D printers were shown to have an almost perfect anatomical accuracy [13], this accuracy depends on the prevalence of geometric distortions acquired during several manufacturing steps, including data acquisition, image processing, mesh refinement, and model manufacturing [13]. Key issues identified with the use of 3D printing include challenges with eliminating air bubbles in the tissue-mimicking solution, damaging the phantom when removing the support material, and using a sufficiently high imaging resolution which ultimately depends on the imaging modality [13]. Filippou [13] proposed several techniques for optimizing the manufacturing and the resulting anatomical accuracy, such as the use of a high-resolution CT scanner for data acquisition, adaptive slicing, and minimizing the amount of support material. Other challenges with 3D printing are related to simulating tissue anisotropy and strain-stiffness mechanical properties [51], which is essential for the accurate design of fetal cardiac and respiratory function in an AMP. Although the Young’s modulus can be designed to be the same as that of the tissue being simulated, the creep tendency of polymeric materials results in a different stress–strain response under large deformations, making it challenging to maintain the desired strain-stiffness properties [51]. 
To overcome this issue, multi-material 3D printing technology can be used to seamlessly manipulate the mechanical properties of the phantom [51]. For example, dual-material 3D-printed metamaterials with microstructured reinforcement embedded in a soft polymeric matrix have been shown to mimic the strain-stiffening behaviour of soft tissues [53, 54].

Importantly, 3D printing can simplify the construction of anthropomorphic phantoms, which are anatomically accurate and simulate the specific fetal tissue conductivity and relaxation times. By this definition, only two anthropomorphic phantoms were identified in the review [8, 36]. Both gravid uterus phantoms consisted of directly 3D-printed hollow fetal and maternal organ compartments which were filled with agar-based tissue-mimicking gels. While this approach created the two most accurate phantoms, its disadvantage is the introduction of plastic boundaries in the 3D-printed compartments, which did not have the same dielectric and relaxation properties as the tissue-mimicking gels [8]. Ideally, one would be able to directly 3D print a complex fetal MRI phantom that possessed accurate dielectric and relaxation properties [52]. However, research in this field is very new and only a few studies to date have been able to manufacture 3D-printed MRI phantoms that produced a quantifiable MR signal; yet, none were for fetal MRI applications [52, 55].

Narrative review limitations

The main limitation of the current review is its very narrow inclusion criteria since only physical fetal MRI phantoms were evaluated. However, it is important to note that examining digital fetal MRI phantoms may be a valuable reference when building a physical model. In addition, the development of a fetal AMP might benefit from a broader review including neonatal phantoms as well as non-fetal AMPs. The effects of maternal motion can also be studied, as maternal motion from various causes (e.g. breathing, physiological bowel motion) has also been shown to cause MAs [16]. Furthermore, there is a growing interest in understanding maternal–fetal metabolic interactions [56], along with the respective anatomy, which would render a combined anthropomorphic MRI/MRS phantom a potentially useful tool for testing and developing multi-parametric pulse-sequence protocols [57].

Conclusion

With a fetal AMP, sequences and algorithms which correct for fetal motion without compromising image quality can be developed, resulting in improved prenatal diagnostic capabilities. Specifically, a fetal AMP would significantly decrease sequence development time and cost by permitting simultaneous multi-parametric optimization and avoiding the use of animal models or recruitment of pregnant patients. To survey the current fetal MRI phantom field, our comprehensive narrative review identified 17 fetal MRI phantoms. Each of the identified phantoms was then quantitatively analyzed based on its anatomical accuracy, dielectric conductivity, relaxation times, and physiological motion properties. The average overall accuracy among all phantoms was only 26%, and phantoms were more likely to be anatomically accurate than physiologically accurate. This indicates that extensive further research needs to be conducted to construct a fetal AMP which can simultaneously simulate accurate cardiovascular, respiratory, and gross body motion. Our quantitative analysis also revealed that 3D printing is superior to traditional methods since it facilitates the synthesis of overall more accurate phantoms. Yet, the most significant disadvantage of 3D-printed phantoms is the introduction of plastic boundaries which do not simulate accurate conductivity and relaxation times. An additional challenge is the lack of comprehensive fetal relaxometry and conductivity studies. Therefore, future research should focus on developing GA-specific tissue-mimicking materials which can be directly 3D-printed. Finally, the next generation of fetal MRI phantoms should also use multi-material 3D printing with flexible solid materials to seamlessly control the mechanical properties of cardiovascular and respiratory soft tissues.