1 Introduction

Routine first trimester US screening is performed in most developed countries at 11–13+6 weeks of fetal gestational age [1]. The aims of this scan are to confirm fetal viability, detect multiple pregnancies, and screen for major abnormalities such as chromosomal defects [2]. Early detection of major fetal abnormalities is important because early termination of pregnancy is associated with lower risks to the mother [3]. Hence, there is interest in investigating whether some of the diagnostic checks performed at the mid-trimester scan can be brought forward to first trimester screening. To allow this, fetal diagnostic planes must first be detected by the clinician, which is challenging at early gestation. The automation of this task is the focus of this paper.

First trimester screening is traditionally performed using 2D US imaging. The advent of 3D fetal US scanning offers potential benefits, as anatomical planes can be reconstructed after image acquisition. In addition, while several 2D images have to be obtained to visualize all important fetal anatomical landmarks, several fetal body parts can be captured in a single 3D US image, which has the potential to reduce examination times and hence improve clinical workflow. For example, Dyson et al. [4] showed that 3D US images provided additional information in 51% of 103 anomalies when compared with 2D US alone. However, in the first trimester, fetal organs are smaller than in the second and third trimesters. Therefore, to allow for proper assessment of abnormalities, consistent and accurate fetal biometry planes must be detected before performing measurements.

With the overall aim of aiding sonographers in first trimester US image interpretation, we present a method to detect the best fetal head and abdominal biometry planes, which are presented for sonographer guidance [1]. Specifically, we propose fetal localization in the sagittal plane using an object proposal approach and SRFs as a pre-processing step. Having localized the fetal region of interest, we partition it into two parts (head and body) using transfer-learning-based CNNs [5]. The best biometry planes for the head and abdomen are extracted by utilizing clinical knowledge of the position of fetal biometry planes within these structures. We compare our automatic method with planes manually selected by an experienced clinician.

2 Related Work

A number of methods have been proposed for standard plane detection from fetal US data [6–8]. Maraci et al. [6], for example, used dynamic texture analysis and Support Vector Machines (SVMs) to classify each frame of 2D US videos in the second trimester. Chen et al. [8] used a transferred recurrent neural network to automatically detect the standard head and abdominal planes. These methods showed promising results; however, their data were acquired in fetal axial planes, in which the fetal parts cover most of the image, making the fetus less challenging to detect. Moreover, their gestational age ranges were in later trimesters, where fetal US structures are generally clearer. In contrast, our data are acquired following the Crown Rump Length (CRL) protocol, and we focus on first trimester fetal data. Because our data are 3D, no time is needed to search for each fetal part; however, since the volumes are acquired in the sagittal plane so that the whole fetus fits in the volume, fetal structures typically appear smaller than non-fetal parts.

3 Localization of the Fetus in the Sagittal Plane

In our solution, the first task is to remove the non-fetal parts to simplify the localization of the fetus. Specifically, we localize the bounding box which encloses the fetus in the sagittal plane. We exploit the fact that the boundary between the amniotic fluid and the fetus is generally clear. Therefore, we develop a method that exploits these sharp boundaries to guide the bounding box localization around the fetus. The main issue with relying on fetal edges is that the boundary between the back of the fetus and non-fetal tissue is typically ambiguous. To address this, we propose a method which learns the best bounding box from a set of candidate boxes. The following two subsections describe our proposed method.

3.1 Edge Detection Using Structured Random Forests

Inspired by Zitnick et al. [9], we propose a method which generates a number of bounding boxes as object proposals based on edge detection using SRFs [10]. SRFs directly predict the local structure of an image patch by using the structural information of pixel neighborhoods to focus on particular patterns in the patch. This produces much cleaner edges, which in turn provide better bounding boxes. In this work, we used a 48 × 48 image patch size to capture global rather than local structure appearance, since small edges are less important for detecting the boundary of the fetus in our application. We used intensity, gradient magnitude, and gradient orientation features to train the SRFs.
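To make this step concrete, the sketch below runs structured-forest edge detection using the implementation in OpenCV's ximgproc module (opencv-contrib-python). The pretrained model file path is an assumption for illustration; the paper trains its own SRF on US data with the patch size and features described above.

```python
# Minimal sketch: structured-forest edge detection on a 2D US slice using
# OpenCV's ximgproc module. The model file "model.yml.gz" is a placeholder;
# the paper's SRF is trained on its own data, not this pretrained model.
import cv2
import numpy as np

def detect_edges(slice_path, model_path="model.yml.gz"):
    # detectEdges expects a 3-channel float32 RGB image with values in [0, 1].
    img = cv2.imread(slice_path)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    img = img.astype(np.float32) / 255.0
    srf = cv2.ximgproc.createStructuredEdgeDetection(model_path)
    edge_map = srf.detectEdges(img)  # per-pixel edge strength in [0, 1]
    return srf, edge_map             # the detector is reused for box proposals
```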

3.2 Detection of the Fetus Region-of-Interest

Given the edges from an SRF, a score measuring the number of edges enclosed within a region of interest (ROI) relative to the number of edges overlapping the ROI boundary is computed [9]. The larger the score, the more fully an object is enclosed within the box. However, relying on the box with the largest score alone does not guarantee a proper localization of the fetus. Therefore, a set of candidate boxes is retained, and the appearance of the best box surrounding the fetus is learnt using an RF classifier.
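A minimal sketch of this proposal-scoring step, using the Edge Boxes implementation in OpenCV's ximgproc module; the parameter values are illustrative rather than those used in the paper, and whether scores are returned alongside the boxes depends on the OpenCV version.

```python
# Sketch: generate scored candidate boxes with the Edge Boxes scheme [9]
# via OpenCV. Assumes `srf` and `edge_map` come from the previous sketch.
import cv2

def propose_boxes(srf, edge_map, max_boxes=1000):
    orientation = srf.computeOrientation(edge_map)
    edges_nms = srf.edgesNms(edge_map, orientation)  # thin the edges
    eb = cv2.ximgproc.createEdgeBoxes()
    eb.setMaxBoxes(max_boxes)
    # The score favors boxes whose enclosed edge strength is high relative
    # to the edges crossing the box boundary (recent opencv-contrib
    # versions return the scores together with the boxes).
    boxes, scores = eb.getBoundingBoxes(edges_nms, orientation)
    return boxes, scores  # boxes are (x, y, w, h)
```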

In the RF classifier [11] which learns the appearance of the best bounding box, we use unary and binary features from both raw and signed-symmetry-mapped images, and Haar-based features from raw images only. Signed-symmetry-mapped images are pre-processed images encoding local phase information, such as local intensity maxima and minima. They are known to be robust to speckle and the low-contrast nature of US images [7]. With this classifier, we classify the boxes into two classes: boxes containing the whole fetus (positive) and boxes that do not (negative). Among the positive boxes, the top three boxes with the largest class probabilities are chosen. Empirically, using more than three boxes worsened the accuracy because more irrelevant boxes were included; we therefore used only the top three boxes in this work. The final bounding box is then found using the following empirical approach to ensure that it contains the whole fetus without losing any fetal parts: the average box of the three candidates is computed, and the resulting box is considered correctly localized if the Intersection over Union (IoU) \( \ge 0.5 \) and \( \frac{\text{Intersection}}{\text{Ground truth}} \ge 0.95 \), which appeared visually acceptable.
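The averaging and acceptance criterion can be written down directly; the sketch below assumes boxes in (x, y, w, h) format, which is our convention for illustration rather than the paper's.

```python
# Sketch of the final-box criterion: average the top-3 positive boxes and
# accept the result if IoU >= 0.5 and >= 95% of the ground truth is covered.
import numpy as np

def average_box(boxes):
    return np.mean(np.asarray(boxes, dtype=np.float64), axis=0)

def box_metrics(pred, gt):
    px, py, pw, ph = pred
    gx, gy, gw, gh = gt
    ix = max(0.0, min(px + pw, gx + gw) - max(px, gx))
    iy = max(0.0, min(py + ph, gy + gh) - max(py, gy))
    inter = ix * iy
    union = pw * ph + gw * gh - inter
    return inter / union, inter / (gw * gh)  # IoU, ground-truth coverage

def correctly_localized(top3_boxes, gt_box):
    iou, coverage = box_metrics(average_box(top3_boxes), gt_box)
    return iou >= 0.5 and coverage >= 0.95
```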

4 Detection of the Best Head and Abdominal Planes

Given the best bounding box placement in the sagittal plane, the US volume is cropped in the axial plane to reduce the search space for the best axial biometry planes. We empirically chose the width of the cropped volume based on the box height, as shown in Fig. 1.

Fig. 1. Based on the box height, the width of the image in the axial plane is calculated as (4/3) times the height, centered at the middle of the image. Using this, we obtain the final cropped axial slice.
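A small sketch of the crop computation in Fig. 1, under the assumption that the crop is applied symmetrically about the image centre:

```python
# Sketch of the axial crop: the crop width is (4/3) * sagittal box height,
# centered at the middle of the image and clipped to the image extent.
def axial_crop_bounds(image_width, box_height):
    crop_width = (4.0 / 3.0) * box_height
    center = image_width / 2.0
    left = max(0.0, center - crop_width / 2.0)
    right = min(float(image_width), center + crop_width / 2.0)
    return left, right  # column range of the cropped axial slice
```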

Once the range of axial slices containing the fetus is detected, we classify the fetal axial slices into three classes: head, body, and non-fetal. We then search for the best head biometry plane among the head candidates only, and the best abdominal biometry plane among the abdominal candidates only. The following two subsections describe these two steps in detail.

4.1 Fetal-Partitioning via Transfer Learning CNNs

In this step, we aim to partition the fetal axial slices into three classes: head, body, and non-fetal. It is important that the slices are assigned to the correct class so that a good partition can be obtained. We address this task as a classification problem using CNNs [12]. CNNs have the advantage of automatically learning visual feature descriptors that are invariant to translation. However, medical imaging applications such as this one usually involve small datasets, due to ethics approval constraints or the availability of data only from small clinical trials. Hence, there is no guarantee that a CNN will satisfactorily solve the problem. Therefore, in this work, we use transfer learning [5] to transfer learnt features from a pre-trained network and then fine-tune the CNN on our dataset to reduce overfitting; this also reduces the time required to build the base networks. Specifically, we use a network pre-trained on second-trimester fetal US data [13], which itself initialized its layers from AlexNet [12]. Since the transferred features were learnt from second-trimester fetal US data, one may expect them to be useful for our US data. Our network consists of five convolutional layers and two fully connected layers with 4096 neurons each. We use max pooling layers to non-linearly downsample the feature maps. The output of the last fully connected layer is fed to a Softmax layer (multinomial logistic regression), which is used as the cost function.
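As an illustration of this transfer-learning setup, the sketch below fine-tunes an AlexNet backbone in PyTorch; the pretrained second-trimester network [13] is not publicly packaged, so ImageNet-pretrained AlexNet weights stand in for it here, and the hyperparameters are assumptions rather than the paper's settings.

```python
# Illustrative transfer-learning setup in PyTorch. ImageNet AlexNet weights
# stand in for the pretrained second-trimester network [13].
import torch
import torch.nn as nn
from torchvision import models

def build_partition_net(num_classes=3):
    # AlexNet: 5 conv layers plus fully connected layers with 4096 neurons,
    # matching the architecture described above.
    net = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)
    # Replace the final layer for the three classes: head, body, non-fetal.
    net.classifier[6] = nn.Linear(4096, num_classes)
    return net

net = build_partition_net()
# Softmax with a multinomial logistic regression cost, as described above.
criterion = nn.CrossEntropyLoss()
# Fine-tune with a small learning rate (assumed value) to limit overfitting.
optimizer = torch.optim.SGD(net.parameters(), lr=1e-4, momentum=0.9)
```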

For this step to succeed, it is also important that the biometry planes for the head and abdomen lie within the predicted ranges for the head and body, respectively.

4.2 Detection of Best Plane for Fetal Head and Abdomen Biometry

Based on the partitioned regions of the fetal slices, we use a greedy approach to find the best plane for both the head and the abdomen. From the training data, we fit a linear regression to predict the distance of the best head plane from the approximate fetal crown as a function of the length of the head. Similarly, we fit another linear regression predicting the distance of the best abdominal plane from the approximate fetal rump as a function of the length of the body.
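A minimal sketch of the head-plane regression with scikit-learn; the numbers are toy values for illustration, not measurements from the paper's dataset (the abdominal regression is analogous):

```python
# Toy illustration of the head-plane regression; all values are invented.
import numpy as np
from sklearn.linear_model import LinearRegression

# Per-volume head length (mm) and distance (mm) of the expert-chosen best
# head plane from the approximate fetal crown (hypothetical numbers).
head_length = np.array([[18.0], [21.5], [19.2], [20.3]])
plane_dist = np.array([9.1, 10.8, 9.7, 10.2])

head_reg = LinearRegression().fit(head_length, plane_dist)
print(head_reg.predict([[20.0]]))  # predicted crown-to-plane distance (mm)
```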

During testing, as shown in Fig. 2, we first find the sets of axial planes which belong to the head and the body. We then use the distribution of each set to estimate whether the head is located on the left or the right side of the image. We exploit the anatomical constraint that, if the head is located on the left, body slices should not appear on the left side of the image. Using this constraint, slices on the left that were previously classified as body are reassigned to the head class. Based on this result, we then apply the linear regression to locate the best plane for the head and the abdomen.
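A sketch of this correction step under simple assumptions: the head side is estimated from the mean slice index of each class, and body-labelled slices on the head side of the midpoint between the two class centroids are reassigned. The label encoding and boundary choice are ours for illustration, not the paper's.

```python
# Sketch of the head-side anatomical-constraint correction.
import numpy as np

HEAD, BODY, NONFETAL = 0, 1, 2  # assumed label encoding

def apply_head_side_constraint(labels):
    """Reassign body-labelled slices that fall on the head side."""
    labels = np.asarray(labels).copy()
    head_idx = np.flatnonzero(labels == HEAD)
    body_idx = np.flatnonzero(labels == BODY)
    if head_idx.size == 0 or body_idx.size == 0:
        return labels
    # Estimate which side the head lies on from the class centroids.
    head_on_left = head_idx.mean() < body_idx.mean()
    boundary = (head_idx.mean() + body_idx.mean()) / 2.0
    for i in body_idx:
        on_head_side = i < boundary if head_on_left else i > boundary
        if on_head_side:
            labels[i] = HEAD  # body slice on the head side: relabel as head
    return labels
```

For example, apply_head_side_constraint([0, 0, 1, 0, 1, 1, 2]) relabels the stray body slice at index 2 as head, mirroring the correction in Fig. 2(b).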

Fig. 2. Exploiting the anatomical constraint. (a) Acquire the sets of axial planes which belong to the head and body. (b) Misclassified slices (red box) are corrected, since the head is on the left according to the partitioning result. (c) The final partitioning result. (Color figure online)

5 Experiments

The volumetric US data used in this work were acquired using the same clinical protocol as the 2D CRL view [1], which makes our solution consistent with the current clinical workflow of the standard dating scan. During 3D image acquisition, the whole fetus must be present and in a neutral position [1]. The first trimester fetal volumes were obtained from mothers participating in the INTERGROWTH-21st study [14]. The image resolution was 0.33 mm³. Images were acquired following a standardized protocol similar to the acquisition protocol in current clinical practice; this ensures that the fetus is facing upward and in a neutral position, as mentioned in Sect. 3. We acquired a total of 64 first trimester 3D US volumes. The axial slices of each volume belonging to the head and body were manually selected by an expert and used to train the proposed solution. The expert also specified the best manual biometry plane for the head and abdomen, which we used as the ground truth against which to compare the automatic biometry planes.

6 Results

6.1 Localization of the Fetus in the Sagittal Plane

As shown in Fig. 3(b), the edge strength is strong between the fetus and the amniotic fluid, but weak between the fetus and non-fetal tissue. As mentioned in Sect. 3.2, a large number of candidate bounding boxes are generated; by excluding out-of-range boxes, this number can be reduced significantly, for example from 1000 to 100 for one test volume. We achieved a correct localization rate of 84.4% using the criterion given in Sect. 3.2.

Fig. 3. Progress of the bounding box around the fetus. (a) Original image. (b) Edge-mapped image. (c) The 3 bounding boxes based on the detected edges that received the top 3 class probabilities. (d) Final bounding box obtained by averaging the 3 boxes.

6.2 Fetal-Partitioning and Extraction of Biometry Planes

We performed 3-fold cross-validation, and the mean per-slice classification accuracy of the CNN was 76.9%. Figure 4 shows visual results of the fetal slice partitioning step. From the partitioning results, it was also found that most of the ground-truth biometry planes were within the predicted ranges of the head and the body. The mean distance between the manually and automatically extracted biometry planes was 1.6 ± 0.2 mm for the fetal head and 3.4 ± 0.4 mm for the fetal abdomen. Typical results are shown in Fig. 5.

Fig. 4. The partitioning result. (Color figure online)

Fig. 5. Manually and automatically selected best planes for the fetal head and fetal abdomen, with the distances between them.

For a demonstration of the whole process along with the results, please refer to the supplementary material of this paper.

7 Discussion and Conclusion

We have developed a method for automatic localization of the fetus in a 3D US scan and subsequent detection of the best biometry planes for both the fetal head and the fetal abdomen in the axial plane. The localization of the whole fetus in the sagittal plane using the bounding box approach not only helps a clinician better visualize the whole fetus but also supports measurement of the CRL. We presented a method which partitions the fetal volume into the fetal head and body using CNNs and extracts the biometry planes for the fetal head and abdomen using linear regression, with promising results. Extending the current prototype into a clinical tool is part of our future vision. The current approach uses multiple machine learning methods to select the best biometry planes; we plan to investigate the use of a single multi-task CNN framework to tackle all the steps and to compare it with the current proposed solution.