Introduction

One of the most frequently raised questions associated with the increasing clinical use of radiologic imaging is the management of radiation dose for patients presenting with high radiosensitivity, including pregnant and pediatric patients [1,2,3]. Radiologic imaging should be avoided in pregnant patients; however, in some very specific situations, computed tomography (CT) or even positron emission tomography (PET)/CT becomes unavoidable [4]. Under these circumstances, the radiation risks to the fetus are a significant concern. At fetal doses greater than 50 mGy, the potential harmful effects include embryonic death, intrauterine growth restriction, average intelligence quotient (IQ) loss, mental retardation, organ malformation, and small head size [5,6,7]. Stochastic effects might also occur at fetal doses below 50 mGy [8,9,10]. According to ICRP publications [6, 11], termination of pregnancy at fetal doses below 100 mGy cannot be justified on the basis of radiation risks to the fetus, whereas for fetal doses between 100 and 500 mGy, the decision to terminate the pregnancy should be based on individual circumstances. In this context, accurate estimation of the conceptus dose plays a key role in managing safety and quality in CT imaging of pregnant patients, keeping in mind current international recommendations on good practice and diagnostic reference levels [12].

Different techniques have been adopted for estimating the conceptus dose to pregnant patients, including Monte Carlo calculations using dedicated computational anthropomorphic models [2, 13,14,15,16,17,18,19,20,21,22,23,24] and experimental measurements using physical phantoms [25,26,27,28]. However, these techniques suffer from a number of limitations, including the difficulty of accurately modeling an individual patient's anatomy in terms of the location, shape, and size of the internal organs and of the uterus/fetus within the pregnant patient's body [29]. The assumptions inherent in the modeling and simulation setups might result in substantial overestimation or underestimation of the conceptus dose. Monte Carlo calculations are commonly combined with computational phantoms to simulate radiation transport and account for all aspects of particle interactions within the human body, and this approach is deemed the most accurate method for calculating the absorbed radiation dose [30]. The relevance and reliability of Monte Carlo–based calculations are closely tied to the adopted computational model, which reflects the physical characteristics and anatomical features of the human body. In this regard, accurate delineation of the anatomical structures of the human body is required to estimate organ doses and radiation risks [31]. This task, referred to as “image segmentation”, is commonly part of the process of constructing computational phantoms. However, manual delineation of body contours and internal organs is time-consuming and suffers from intra- and inter-observer variability.

Deep learning algorithms have recently been applied in the fields of image analysis and radiation therapy [32] and have proved superior to previous state-of-the-art segmentation methods. Instead of relying on manually described tissue patterns, deep learning methods discover multiple levels of representation and abstraction and use hierarchical layers of learned abstraction to capture image characteristics. Different categories of deep learning algorithms have been used, including recurrent neural networks [33], gated recurrent units [34], deep belief networks [35], Hopfield neural networks [36], generative adversarial networks [37], Boltzmann machines [38], and convolutional neural networks (CNNs) [39]. The latter have many attractive features, such as a simple structure, fewer training parameters, and adaptability. Their network structure resembles biological neural networks, which makes them well suited to pattern recognition and medical image analysis. Zhou et al [40] trained a CNN to automatically map each voxel of 3D CT images to an anatomical label, whereas Liu et al [41] adopted super-pixels and CNNs for automatic segmentation of the liver and lung from CT images. Likewise, Weston et al [42] performed segmentation of abdominal CT images for body composition analysis using CNNs based on the U-Net architecture.

Segmentation of organs/tissues from medical images, which is commonly manual and time-consuming, is the first step in the construction of computational models. In this work, we report on an automated methodology for constructing computational models using CNN-based image segmentation for estimating the conceptus dose to pregnant patients undergoing CT examinations.

Materials and methods

Data acquisition

This study included thirty-two pregnant patients referred to the emergency unit of Geneva University Hospital (HUG) for abdominal and pelvic CT scans. The institutional ethics committee approved this retrospective study and written informed consent was waived. The patients were scanned with a volumetric CT acquisition protocol on a Discovery CT750 HD scanner (GE Healthcare) (120-kVp tube voltage, 22.9 effective mAs without tube current modulation, pitch of 1.375, slice thickness from 0.625 to 3 mm, and table speed of 55 mm/rotation). CT images were reconstructed with a matrix size of 512 × 512. All pregnant patients who received low-dose CT scans between March 2011 and September 2018 were included. The 32 patients (age range 19–45 years) had gestational ages ranging from 8 to 35 weeks. The body contour, skeleton, liver, kidneys, lungs, and uterus were manually contoured by an experienced medical physicist under the supervision of qualified radiologists. The thirty-two voxelized computational models resulting from this manual segmentation were used to derive reference dosimetric estimates. The right and left kidneys were segmented together as a single structure, as were the right and left lungs. The maternal perimeter, defined as the maximum outer perimeter of the patient on the slice containing the uterus, was measured automatically from the manually constructed model of each patient. Likewise, the average distance from the skin to the closest surface of the uterus was used as an indicator of conceptus depth. These two measurements characterize the size of the maternal body and the location of the uterus within the mother's abdomen, respectively.

The segmented volumetric images and original CT data of the 32 patients were resampled to a unified matrix size of 128 × 128 × 64 with a slice thickness of 2 mm. Data augmentation was performed to increase the number of samples to 128 datasets by randomly cropping the edges of the original volumetric CT images; the original CT and corresponding segmented images were cropped identically for consistency. Eighty percent (102) of the 128 datasets were randomly selected for training, whereas the remaining 26 datasets were used for testing. The training set was used to adjust the parameters of the proposed CNN, whereas the test set was used to assess its performance, with the manual segmentation serving as reference.
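For illustration, the following minimal sketch outlines this resampling, crop-based augmentation, and train/test split, assuming NumPy arrays and SciPy's `zoom` for resampling; the crop margin, random seed, and variable names are illustrative and not part of the reported implementation.

```python
import numpy as np
from scipy.ndimage import zoom

TARGET_SHAPE = (128, 128, 64)  # unified matrix size (x, y, z)

def resample_to_target(volume, order=1):
    """Resample a volume to the unified 128 x 128 x 64 grid."""
    factors = [t / s for t, s in zip(TARGET_SHAPE, volume.shape)]
    return zoom(volume, factors, order=order)

def random_edge_crop(ct, labels, max_crop=8):
    """Randomly crop the edges of a CT volume and its label map identically,
    then resample both back to the target grid -- one augmented sample."""
    lo = np.random.randint(0, max_crop, size=3)
    hi = np.array(ct.shape) - np.random.randint(1, max_crop, size=3)
    sl = tuple(slice(int(l), int(h)) for l, h in zip(lo, hi))
    return (resample_to_target(ct[sl], order=1),       # linear for CT intensities
            resample_to_target(labels[sl], order=0))   # nearest for label maps

# Augment the 32 original pairs to 128 samples (pairing details omitted),
# then split roughly 80/20: 102 training and 26 test datasets.
rng = np.random.default_rng(0)
indices = rng.permutation(128)
train_idx, test_idx = indices[:102], indices[102:]
```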

Convolutional neural network–based segmentation

CNNs have become the most popular deep learning algorithms for medical image analysis. The U-Net architecture integrates spatial and contextual information in a network comprising an analysis path and a synthesis path for pixel-wise prediction of the label probability of classified tissues and organs. In this work, we adopted a modified 3D U-Net [43, 44] to segment internal organs from CT images for Monte Carlo–based radiation dose calculations. Figure 1 shows the architecture of the proposed network, which consists of an encoder module and a decoder module, each with three resolution steps of 32, 64, and 128 feature maps. In the encoder module, each layer contains two 3D convolutions followed by a rectified linear unit (ReLU) and then a 3D max pooling with a stride of 2 × 2 × 2. In the decoder module, each layer consists of a 3D upsampling with a 2 × 2 × 2 stride followed by two 3D convolutions and a ReLU. All convolutional kernels are of size 3 × 3 × 3.

Fig. 1

Proposed convolutional neural network model used for automated segmentation of pregnant patients’ CT images. BN refers to batch normalization, and ReLU to the rectified linear unit activation function

Shortcut connections between layers of equal resolution transfer the high-resolution features from the analysis path to the synthesis path. The number of channels was doubled before each max pooling operation (a sample-based discretization step) to avoid bottlenecks. Batch normalization was applied before each ReLU to normalize each mini-batch to its mean and standard deviation, updating the global statistics during training. The input to the network is a 128 × 128 × 64 voxel image matrix with 7 channels, and the output of the final layer is 120 × 120 × 128 voxels in the x, y, and z directions, respectively. The architecture has 3,192,871 parameters in total and comprises a dilated convolutional layer, 16 convolutional layers, and a series of pooling operations.
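The Keras sketch below illustrates the overall encoder–decoder layout described above (two 3D convolutions per resolution step with batch normalization before each ReLU, 2 × 2 × 2 max pooling and upsampling, skip connections, and 32/64/128 feature maps). The single-channel input, the 7-class softmax output (background plus the six segmented organs/tissues), and the exact layer arrangement are assumptions for illustration; the sketch does not reproduce the dilated convolutional layer or the exact parameter count of the trained network.

```python
from tensorflow import keras
from tensorflow.keras import layers

def conv_block(x, filters):
    """Two 3x3x3 convolutions, each with batch normalization before ReLU."""
    for _ in range(2):
        x = layers.Conv3D(filters, 3, padding="same")(x)
        x = layers.BatchNormalization()(x)
        x = layers.Activation("relu")(x)
    return x

def build_unet3d(input_shape=(128, 128, 64, 1), n_classes=7):
    inputs = keras.Input(shape=input_shape)

    # Encoder (analysis path)
    e1 = conv_block(inputs, 32)
    p1 = layers.MaxPooling3D(pool_size=(2, 2, 2))(e1)
    e2 = conv_block(p1, 64)
    p2 = layers.MaxPooling3D(pool_size=(2, 2, 2))(e2)

    # Bottleneck at the coarsest resolution
    b = conv_block(p2, 128)

    # Decoder (synthesis path) with skip connections from equal-resolution layers
    u2 = layers.UpSampling3D(size=(2, 2, 2))(b)
    d2 = conv_block(layers.Concatenate()([u2, e2]), 64)
    u1 = layers.UpSampling3D(size=(2, 2, 2))(d2)
    d1 = conv_block(layers.Concatenate()([u1, e1]), 32)

    # Per-voxel class probabilities (softmax over the segmentation classes)
    outputs = layers.Conv3D(n_classes, 1, activation="softmax")(d1)
    return keras.Model(inputs, outputs)

model = build_unet3d()
```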

The input images and their corresponding segmentation maps were used to train the network with the adaptive moment estimation (Adam) implementation of Keras [45], which computes adaptive learning rates for each parameter and keeps an exponentially decaying average of past gradients:

$$ {m}_t={a}_1{m}_{t-1}+\left(1-{a}_1\right){g}_t $$
(1)
$$ {v}_t={a}_2{v}_{t-1}+\left(1-{a}_2\right){g}_t^2 $$
(2)

where $m_t$ and $v_t$ are estimates of the mean and the uncentered variance of the gradients, respectively, $g_t$ is the gradient at time step $t$, and $a_1$ and $a_2$ are exponential decay rates with $a_1, a_2 \in [0, 1)$.
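For concreteness, the plain-NumPy sketch below reproduces the moment updates of Eqs. (1) and (2) together with the bias correction and parameter update that the Keras Adam implementation applies internally; the variable names and default decay rates (0.9, 0.999) are illustrative and not values reported here.

```python
import numpy as np

def adam_step(theta, g, m, v, t, lr=1e-3, a1=0.9, a2=0.999, eps=1e-8):
    """One Adam update for parameters theta given gradient g at step t >= 1."""
    m = a1 * m + (1 - a1) * g          # Eq. (1): decaying mean of gradients
    v = a2 * v + (1 - a2) * g**2       # Eq. (2): decaying uncentered variance
    m_hat = m / (1 - a1**t)            # bias correction of the first moment
    v_hat = v / (1 - a2**t)            # bias correction of the second moment
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```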

The categorical cross-entropy loss function was used during network training to quantify the penalty for inaccurate class predictions. It is defined as:

$$ f{(p)}_i=\frac{e^{p_i}}{\sum_{j=1}^C{e}^{p_{\mathrm{j}}}} $$
(3)
$$ {L}_{\mathrm{CE}}=-{\sum}_{i=1}^C{t}_i\log \left(f{(p)}_i\right) $$
(4)

where $L_{\mathrm{CE}}$ is the cross-entropy loss, $f(p)_i$ is the softmax function, $p_i$ is the score inferred by the network for class $i$ among the $C$ classes, and $t_i$ is the corresponding ground-truth (one-hot) label at the given voxel.
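A plain-NumPy sketch of Eqs. (3) and (4) for a single voxel is given below; in practice this computation is handled internally by Keras's categorical cross-entropy loss, and the stability shift and small epsilon are illustrative safeguards.

```python
import numpy as np

def softmax(p):
    """Eq. (3): class scores p (length C) -> class probabilities."""
    e = np.exp(p - p.max())            # shift for numerical stability
    return e / e.sum()

def cross_entropy(p, t):
    """Eq. (4): t is the one-hot ground-truth vector for one voxel."""
    f = softmax(p)
    return -np.sum(t * np.log(f + 1e-12))
```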

The CNN model was implemented using the open-source Keras package [45] and run on an NVIDIA Quadro K5000 GPU workstation with 32 GB of memory under the Ubuntu operating system. To minimize overhead and make maximum use of the GPU memory, the network was trained with a mini-batch size of 2 for 100 epochs. The learning rate was initialized to 10−3, and the network weights were initialized from a normal distribution with zero mean and a standard deviation of 0.01.
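The following sketch summarizes this training configuration in Keras; `build_unet3d` refers to the model sketch above, and `x_train`/`y_train` are placeholders for the preprocessed CT volumes and one-hot label maps.

```python
from tensorflow import keras

# Weight initializer drawn from N(0, 0.01); in the model sketch above this
# would be passed to each Conv3D layer via kernel_initializer=init.
init = keras.initializers.RandomNormal(mean=0.0, stddev=0.01)

model = build_unet3d()                      # network from the sketch above
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-3),
              loss="categorical_crossentropy")

# x_train: (102, 128, 128, 64, 1) CT volumes; y_train: one-hot label maps
model.fit(x_train, y_train, batch_size=2, epochs=100)
```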

Computational phantoms

Overall, manual segmentation of the original CT images of the 32 patients took approximately 1 month, whereas, once training (which took 22 days) was completed, the automated segmentation of one CT dataset takes approximately 20 s. A total of 6 organs/tissues were segmented, automatically or manually, from 3200 transaxial slices, and a label was assigned to each structure. Organ masses were calculated by multiplying the number of segmented voxels by the voxel volume and the corresponding tissue density reported in ICRP publication 89 [46]. Figure 2 shows representative transverse, sagittal, and coronal slices of the original CT images and of those generated by the manual and automated segmentation techniques. Figure 3 shows 3D coronal and sagittal views of the computational models generated using manual and automated segmentation, where the body is rendered transparent for enhanced viewing of the internal organs and skeleton.

A selected set of metrics was used to measure the agreement between manual and automated segmentations, including the Jaccard similarity coefficient, Dice similarity coefficient (DSC), sensitivity, positive predictive value (PPV), volume difference, and Hausdorff distance (HD). The Jaccard coefficient measures the similarity between two segmentations and is defined as the volume of the intersection divided by the volume of the union of the manual and automatic segmentations. The DSC is closely related to the Jaccard coefficient and is calculated as DSC = 2 × Jaccard/(1 + Jaccard). The sensitivity describes the ability of the CNN to correctly classify an individual voxel as belonging to the target organ and equals the number of true positives divided by the sum of true positives and false negatives. The PPV is the proportion of voxels labeled by the automated segmentation that are correctly identified and is defined as PPV = true positives/(true positives + false positives). The HD reflects translations and shape discrepancies between the computational phantoms produced using the manual and automated segmentation techniques. It is calculated from voxel-to-voxel distances as:

$$ \mathrm{HD}=\max \left\{ \max_{i}\min_{j}\left\Vert a_{i}-b_{j}\right\Vert ,\ \max_{i}\min_{j}\left\Vert b_{i}-a_{j}\right\Vert \right\} $$
(5)

where $\left\Vert a_i-b_j\right\Vert$ is the Euclidean distance between point $a_i$ in the manually segmented phantom and point $b_j$ in the automatically segmented phantom.
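The following sketch, assuming boolean NumPy masks defined on a common voxel grid, shows how these agreement metrics could be computed for a single organ; the voxel spacing argument and the brute-force Hausdorff computation are illustrative simplifications rather than the authors' implementation.

```python
import numpy as np
from scipy.spatial.distance import cdist

def segmentation_metrics(a, b, voxel_size_mm=(1.0, 1.0, 2.0)):
    """Agreement metrics between a manual mask `a` and an automated mask `b`."""
    tp = np.logical_and(a, b).sum()
    fp = np.logical_and(~a, b).sum()
    fn = np.logical_and(a, ~b).sum()

    jaccard = tp / (tp + fp + fn)
    dsc = 2 * jaccard / (1 + jaccard)                  # equivalently 2TP/(2TP+FP+FN)
    sensitivity = tp / (tp + fn)
    ppv = tp / (tp + fp)
    volume_diff = (b.sum() - a.sum()) / a.sum() * 100  # percent

    # Hausdorff distance (Eq. 5) over the voxel coordinates of each mask,
    # scaled to millimetres. For large masks a surface-point subset or
    # scipy.spatial.distance.directed_hausdorff would be used instead.
    pa = np.argwhere(a) * np.asarray(voxel_size_mm)
    pb = np.argwhere(b) * np.asarray(voxel_size_mm)
    d = cdist(pa, pb)
    hd = max(d.min(axis=1).max(), d.min(axis=0).max())

    return dict(jaccard=jaccard, dsc=dsc, sensitivity=sensitivity,
                ppv=ppv, volume_diff=volume_diff, hd=hd)
```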

Fig. 2

Illustration of the CNN-based image segmentation on a representative patient study showing transaxial, sagittal, and coronal slices of the original CT images (left column), manual segmentation (middle column), and automatic segmentation (right column)

Fig. 3

3D views through representative slices of the generated pregnant woman computational models using manual segmentation (left column) and automated segmentation (right column)

Monte Carlo simulations and radiation dose calculations

Monte Carlo modeling was performed using a previously validated model of the GE Discovery CT750 HD source, in which the CT gantry geometry includes a Performix Pro VCT 100 x-ray tube with a 7° target angle and a 56° fan-beam angle, allowing a beam collimation of 40 mm [47]. At 120 kVp, the measured half-value layer was 7.8 mm Al, and the quality-equivalent filtration of the x-ray tube is 4.3 mm Al. The source-to-isocenter and source-to-detector distances of this scanner are 54 cm and 95 cm, respectively. The computational phantoms and the CT source and gantry models were integrated into the MCNPX Monte Carlo code [48] to simulate low-dose CT examinations with a helical source path and a total collimation width of 64 × 0.625 mm. The energy deposited in the computational phantoms produced by the manual and automated segmentation techniques was recorded and used to calculate organ absorbed doses. The estimated organ-level absorbed doses (in mGy) were used to compare the dosimetric characteristics of the computational models generated using the two segmentation techniques.

Results

The results of the automated segmentation were compared with those of the manual segmentation for all patients with respect to the total body, skeleton, liver, lungs, kidneys, and uterus using different metrics (Table 1). The Jaccard similarity coefficient of the segmented organs varies between 0.85 and 0.96 with an average of 0.90 ± 0.04, while the DSC varies between 0.92 and 0.98 with an average of 0.94 ± 0.02. The sensitivity varies between 0.90 and 0.97 with an average of 0.95 ± 0.03, while the PPV varies between 0.92 and 0.98 with an average of 0.94 ± 0.02. The accuracy of the size and location of the segmented organs between the phantoms constructed by the two methods was evaluated by calculating the volume difference and the HD. The former ranges from −4.90 to 2.96% with an average of 0.26% ± 2.67%, while the latter varies between 10.71 and 50.07 mm with an average of 23.62 ± 12.86 mm among organs. It should be emphasized that the uterus dose served as surrogate for the conceptus dose in this work. For the segmented uterus, the Jaccard similarity coefficient, DSC, sensitivity, PPV, volume difference, and HD across patients are 0.88 ± 0.06, 0.94 ± 0.04, 0.94 ± 0.05, 0.93 ± 0.03, 1.49% ± 4.19%, and 16.1 ± 8.16 mm, respectively.

Table 1 Summary of image segmentation metrics, including the Jaccard similarity coefficient, Dice similarity coefficient, sensitivity, positive predictive value, volume difference and Hausdorff distance, used for the comparative assessment of the automated image segmentation algorithm for six identified organs

Table 2 summarizes the calculated conceptus doses of the pregnant patients according to gestational age, which varies between 8 and 35 weeks, together with the conceptus doses estimated using the computational models developed with manual and automated segmentation. The measured maternal perimeter varies from 65.2 to 127.7 cm with a mean value of 93.5 cm, and the mean conceptus dose is 2.91 ± 0.7 mGy. Figure 4 compares the absorbed doses to the kidneys, liver, lungs, uterus, skeleton, and total body between the manual and automated segmentation models, and Fig. 5 shows the relative differences of the absorbed doses in the target organs between the two segmentation methods. The relative absorbed dose differences for the total body among the thirty-two patients range from −0.21 to 1.48% with an average of 0.28% ± 0.39%, whereas the dose differences for the skeleton vary between −2.26 and 1.33% with an average of −0.48% ± 0.91%. The dose difference for the uterus ranges from −5.98 to 6.31% with an average of −0.12% ± 2.62%. The absorbed dose in target regions with low density and small volume is more strongly affected by the organ's shape and position; therefore, the dose difference between manual and automated segmentation observed for the lungs is higher than that for the liver and kidneys, while the skeleton and total body show the smallest dose differences. Figure 6 compares the relative differences of organ and conceptus absorbed doses between the manual and automated segmentation techniques at different gestational periods. The mean conceptus dose differences between the two methods are 0.93% ± 2.5%, −1.19% ± 2.97%, 0.053% ± 2.51%, and −0.38% ± 1.28% at early pregnancy, the first trimester, the second trimester, and the third trimester, respectively. The error bars in Figs. 5 and 6 represent the minimum and maximum dose differences between manual and automatic segmentations among patients; the underlying uncertainty reflects the anatomic differences of the segmented target regions. Conceptus doses estimated from the maternal perimeter using the technique described by Angel et al [13] are also compared in Table 2 and Fig. 6b.

Table 2 Summary of gestational age and maternal (waist) perimeter of the thirty-two pregnant patients, as well as the uterine doses from abdominal CT scans estimated using manual vs. automated segmentation techniques
Fig. 4

Comparison between absorbed doses from abdominal CT scans to the kidney, liver, lung, uterus, skeleton, and total body estimated using the pregnant computational models produced by manual vs. automated segmentation. The error bars represent ± one standard deviation (STD) of the organ doses among the patients

Fig. 5

Relative differences between absorbed doses (in %) from abdominal CT scans estimated in target organs identified in Fig. 4 when using computational models produced by manual vs. automated segmentation

Fig. 6

a Relative dose differences to the uterus (in %) from abdominal CT scans when using computational models produced by manual vs. automated segmentation at different gestational periods. b Same as a, but comparing the computational models produced by manual segmentation with dose estimates obtained using the model by Angel et al [13]

Discussion

A deep learning–based algorithm was optimized for the automated segmentation of CT images of pregnant patients exhibiting a wide range of anatomical variability. The resulting patient-specific computational models were suitable for incorporation into dedicated Monte Carlo codes for radiation dose calculations. The evaluation of the proposed CNN on thirty-two clinical studies demonstrated good agreement between automated and manual segmentation results (mean DSC, PPV, volume difference, and HD of 0.95, 0.95, 0.26%, and 23.62 mm, respectively).

Accurate delineation of organs at risk is critical for radiation dose estimation in radiation protection and treatment planning, yet a fully automated process for constructing patient-specific computational models has remained elusive. Our CNN-based approach provides a fully automated solution for radiation dosimetry in CT imaging. It requires no user interaction and enables efficient, automated segmentation for fast construction of computational models and accurate radiation dose calculation, thus facilitating eventual clinical adoption. The goal of our framework is to automatically delineate the contours of internal organs at risk from abdominal CT scans in order to build patient-specific computational models for radiation dosimetry. Taking advantage of the convolutional neural network architecture, automated feature extraction is performed on patient CT images, enabling reliable identification of the target organs.

The absorbed dose to the uterus is commonly used as a surrogate for the absorbed dose to the embryo/fetus in medical radiation dosimetry [5], and different methods account for the enlarged uterine volume compared with the non-pregnant female [5, 49]. The dose calculation process reported in this work has the advantage of incorporating the individual characteristics of the patient's anatomy into the dose calculation algorithm. The radiation dose from diagnostic imaging procedures is normally lower than that from radiation therapy but still carries a risk that cannot be eliminated entirely.

The conceptus dose for pregnant patients undergoing CT examinations varies within the range 1.8–4.7 mGy with a mean value of 2.9 mGy. The differences in mean conceptus absorbed dose between the two segmentation techniques are 1.66%, 2.55%, 1.84%, and 1.18% at early pregnancy, the first trimester, the second trimester, and the third trimester, respectively; it is not clear why the largest difference was observed in the first trimester. The differences in mean conceptus absorbed dose between manual segmentation and the technique proposed by Angel et al [13] are 19.33%, 19.39%, 10.97%, and 16.37% at early pregnancy, the first trimester, the second trimester, and the third trimester, respectively. Overall, the computational phantoms generated by the automated segmentation algorithm yielded individual conceptus dose estimates very close to those obtained with manual segmentation.

This study demonstrates the feasibility of automated construction of patient-specific computational models for organ-level dose estimation using deep learning approaches. The quality of typical low-dose CT images made it difficult to segment all organs; consequently, only a limited number of organs were segmented, which prevented estimation of all organ doses for the pregnant patients. Another limitation is that the uterus dose was used as a surrogate for the radiation dose to the fetus, whereas the fetal position may vary among patients and across gestational periods, adding uncertainty to the dose estimation process. Overall, the computational time required to construct a patient-specific model using automated segmentation is approximately 20 s per dataset. With advances in computer technology, particularly grid and cloud computing, and advanced variance reduction techniques, the proposed approach could be implemented clinically for real-time calculation of individual patient radiation doses.

Conclusion

Individual anatomic characteristics have a noticeable impact on conceptus dose estimates, and this study shows that patient-specific computational models can be created using an automated deep learning–based segmentation algorithm. Radiologists could thus perform accurate patient-specific dose estimation for a variety of radiation exposure situations, and specifically for the evaluation of the conceptus dose. In most situations encountered in emergency units, the benefit of performing CT outweighs the radiation risk. Nevertheless, the proposed approach for automated computational modeling and dose calculation can be useful for retrospective evaluation of radiation dose, for instance when the pregnancy was unknown at the time of an emergency CT scan, for decision-making in high-dose procedures in the clinical setting, and in research studies involving retrospective data analysis.