1 Introduction

Computer assisted intervention (CAI) aims to equip the surgeon with a “surgical cockpit”, where the live position of surgical instruments, preoperative imaging and intraoperative organ position are represented within the same coordinate system. The core process of CAI is registration, which aims to find a geometrical mapping between the preoperative image and the intraoperative organ position.

Within computer assisted open liver surgery, the preferred method for obtaining information about the intraoperative position of the organ is a tracked 2D ultrasound (US) probe (optically or electromagnetically tracked). Since US can visualize underlying anatomical structures (e.g. tumors, hepatic and portal veins), it is widely accepted and integrated into the surgical workflow.

By coupling a 2D US probe with a tracker, intraoperative 3D US volumes can be reconstructed. Similar to conventional tomographic images, hepatic vasculature imaged within these volumes can be automatically segmented, and used for generating a 3D model of the underlying vasculature. Such a model can be registered with its preoperative counterpart (CT or MRI-based). Accuracy of this registration is greatly dependent on the extent and accuracy of the US-based segmentation of liver vasculature.

The majority of previous work on hepatic vasculature segmentation in 2D US is based on conventional segmentation techniques. In [4, 5], edge detection algorithms based on the difference of Gaussians were evaluated in phantom settings. In [11], semi-automated region growing methods were used, while in [14] dynamic texturing combined with k-nearest neighbor classification was adopted. Other methods combined extended Kalman filters with constraints on the detection of ellipsoid models [7, 19] or tubular structures [1, 10, 18]. Despite promising results in phantom settings, these methods have proven less successful in clinical settings, since they are prone to mislabeling under sub-optimal imaging conditions (e.g. artefacts, shadows, air-gaps, vessel abnormalities).

With the advances in deep learning, many convolutional neural network (CNN) based segmentation techniques that outperform conventional algorithms have been developed for a broad range of clinical applications. Similar advances are emerging in hepatic vasculature segmentation from US volumes. For example, in [16], a CNN combined with k-means clustering for hepatic vasculature extraction was proposed. The network, trained on 132 2D US images, contained significantly fewer parameters than conventional deep learning networks and achieved a segmentation accuracy, expressed as intersection over union (IoU), of 0.696 [16]. Similar approaches were adopted in [21] and [20], where 2D [21] and 3D U-Nets [20] for hepatic vasculature segmentation were proposed. These studies reported average segmentation accuracies, expressed as Dice, of 0.5 for 2D U-Nets and 0.7 for 3D U-Nets.

In the context of registration, the segmentation results obtained in [21] were used to define a region of interest for a two-step registration procedure based on Covariance Matrix Adaptation Evolution Strategy (CMA-ES) and gradient orientation similarity. While these recent studies [16, 20, 21] have shown promising results, they are limited by a number of factors.

First, the aforementioned segmentation methods do not distinguish between the two major types of hepatic vasculature (i.e. hepatic and portal veins). This results in a single 3D model, where hepatic and portal veins are combined and registered to their preoperative counterpart as a single anatomical structure. Because these vessel trees have different mechanical properties and independent mobility, joint registration may result in a larger local registration error. Additionally, depending on the tumor position with respect to the hepatic vasculature, the preservation of a hepatic or portal branch has different clinical implications and may require different degrees of accuracy. We hypothesize that a more realistic segmentation approach would distinguish between hepatic and portal veins. This results in two separate 3D models of the hepatic vasculature, which can then be registered to their preoperative counterparts independently from each other, aiming for a more accurate registration.

Second, the registration method described in [21] is based on rigid transformation between the preoperative CT and the intraoperative model of the vessels. While this methodology has been proven effective within a restricted area, it does not compensate for organ deformation throughout the entire organ. In this manuscript we will apply a non-rigid registration methodology and evaluate its accuracy in terms of two measures. Clinically, the most relevant measure is the registration accuracy that one can achieve in the tumor lesion of interest. In order to generalize the registration to the whole liver, overall registration accuracy between the vasculature is most important.

Third, previous studies are evaluated over limited clinical datasets, making it challenging to generalize to clinical use.

In this study we present a segmentation-registration pipeline that is fully trained and validated on intraoperative imaging. By means of deep learning, we fully automate the intraoperative segmentation process, the output of which is then used for automatic registration of the vasculature from the pre- and intraoperative imaging.

2 Methods

The automatic non-rigid registration pipeline that is proposed in this work is schematically illustrated in Fig. 1. This pipeline enables integration of information regarding the lesions and their location with respect to the major hepatic vasculature into the surgical environment.

An initial registration is performed by recording the orientation of the US transducer and a one-point translation based on the center of the lesion. Fine registration is based on the vasculature that is present in both imaging modalities. In the preoperative imaging, vasculature is segmented semi-automatically based on the method described in [8] and refined manually. The intraoperative US vasculature is segmented automatically using a reduced filter implementation of the standard 3D U-Net architecture [3].
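The initial alignment described above can be sketched as follows. This is a minimal illustration only, assuming that the tracked transducer orientation is available as a 3×3 rotation matrix and that both lesion centers are given in millimetres; the function names are hypothetical and are not taken from the original pipeline.

```python
import numpy as np

def initial_rigid_transform(rotation, lesion_center_preop, lesion_center_us):
    """Build a 4x4 rigid transform from a recorded orientation and one point pair.

    The translation is chosen so that the preoperative lesion center is mapped
    exactly onto the lesion center found in the US volume.
    """
    rotation = np.asarray(rotation, dtype=float)
    c_pre = np.asarray(lesion_center_preop, dtype=float)
    c_us = np.asarray(lesion_center_us, dtype=float)
    transform = np.eye(4)
    transform[:3, :3] = rotation
    transform[:3, 3] = c_us - rotation @ c_pre  # one-point translation
    return transform

def apply_transform(transform, points):
    """Apply a 4x4 rigid transform to an (N, 3) array of points."""
    points = np.atleast_2d(np.asarray(points, dtype=float))
    return points @ transform[:3, :3].T + transform[:3, 3]
```

Such a transform only provides a starting pose; the fine registration on the vasculature is what corrects the remaining misalignment.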

This architecture is used to train three different deep learning models for separate segmentation purposes: segmentation of all vasculature, of solely the hepatic vein and of solely the portal vein. The centerlines of both the pre- and intraoperative segmentations are then used for non-rigid registration with the coherent point drift (CPD) algorithm [17].

Following the segmentation process, both the pre- and intraoperative segmentations are resampled to isotropic spacing of 1.1 mm, increasing registration speed significantly whilst still maintaining accuracy. The registration accuracy was computed on the hepatic and portal vein by computing two measures. To measure the overlap of the vasculature, we computed the root mean squared error (RMSE) of the residual distances between the centerlines of the segmented vasculature and its preoperative registered counterpart. To measure the clinical accuracy, we computed the target registration error (TRE) as the Euclidean distance between the center of the lesion, acquired through US and manually segmented, and its preoperative registered counterpart. TRE was computed on 11 patients. For each case three TREs were found (using hepatic, portal, and all vasculature). Subsequently, the lowest TRE between the registrations using hepatic or portal vasculature was compared with the TRE found using the combined vasculature.
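The two accuracy measures can be illustrated with a short sketch. It assumes that centerlines are given as (N, 3) point arrays in millimetres and that the residual distance of each intraoperative point is taken to its nearest registered preoperative point; the latter is an assumption, since the exact point pairing is not specified here, and the function names are hypothetical.

```python
import numpy as np

def centerline_rmse(intraop_pts, registered_preop_pts):
    """RMSE of nearest-neighbour residual distances between two centerlines."""
    a = np.asarray(intraop_pts, dtype=float)
    b = np.asarray(registered_preop_pts, dtype=float)
    # Pairwise Euclidean distances, shape (len(a), len(b)).
    dists = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    residuals = dists.min(axis=1)  # assumed pairing: closest registered point
    return float(np.sqrt(np.mean(residuals ** 2)))

def target_registration_error(lesion_center_us, lesion_center_registered):
    """Euclidean distance between the US and registered lesion centers."""
    diff = np.asarray(lesion_center_us, float) - np.asarray(lesion_center_registered, float)
    return float(np.linalg.norm(diff))
```

Both functions return values in the units of the input coordinates, so millimetre inputs give RMSE and TRE in millimetres.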

Fig. 1. Vasculature is extracted from the preoperative scan (CT or MRI) prior to surgery (top row). During surgery vasculature is extracted from a reconstructed US volume (bottom row). Centerlines from both modalities are used for registration.

2.1 Vascular Segmentation

The 3D U-Net architecture that is used is a NiftyNet [6] TensorFlow implementation similar to Çiçek et al. [3], but the number of filters in every layer has been reduced to an eighth, to avoid bottlenecks. A learning rate of \(5 \times 10^{-3}\) with the Adam optimizer and L1 regularization with \(10^{-5}\) weight decay were used for training on four NVIDIA GTX 1080 GPUs with a batch size of 2. From each mean-normalized volume, 20 patches of \(144 \times 144 \times 96\) voxels were sampled and zero-padded with a volume of \(32 \times 32 \times 32\) voxels. Data augmentation consisted of rotation between \(-10^\circ \) and \(10^\circ \), scaling between −10% and 10% and elastic deformation similar to that in [15]. The Dice loss function was used to train the network until the validation loss showed no further apparent improvement. Segmentation performance is reported by means of the Dice score.
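For reference, the Dice score and a soft Dice loss can be written as below. This is a generic NumPy sketch of the standard definitions, not the NiftyNet implementation used for training; the smoothing constant `eps` and the function names are assumptions made for illustration.

```python
import numpy as np

def dice_score(pred_mask, truth_mask):
    """Dice overlap between two binary masks: 2|A and B| / (|A| + |B|)."""
    pred = np.asarray(pred_mask, dtype=bool)
    truth = np.asarray(truth_mask, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0  # both masks empty: treated here as perfect agreement
    return float(2.0 * np.logical_and(pred, truth).sum() / denom)

def soft_dice_loss(prob, truth_mask, eps=1e-7):
    """1 - soft Dice, computed on foreground probabilities in [0, 1]."""
    p = np.asarray(prob, dtype=float)
    t = np.asarray(truth_mask, dtype=float)
    soft_dice = (2.0 * np.sum(p * t) + eps) / (np.sum(p) + np.sum(t) + eps)
    return float(1.0 - soft_dice)
```

The loss operates on probabilities so that it remains differentiable during training, whereas the reported score is computed on thresholded masks.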

2.2 Gaussian Regularized Non-rigid Registration

The automatically segmented intraoperative vascular model was used for registration by means of CPD. To reduce computational cost, centerlines were extracted from the segmentations based on the method of [12]. Next, the preoperative vasculature model was represented as a Gaussian Mixture Model (GMM), while the intraoperative model was treated as observations from the GMM. Unlike diagnostic preoperative imaging, intraoperative US acquisition yields a localized, high resolution, yet noisy image of the local vasculature. Therefore, point clouds of intraoperative centerline models are fundamentally different from those derived from diagnostic imaging. CPD handles noise well and should therefore be robust when registering the complete vascular point cloud (preoperative) to a sub-set of this point cloud (intraoperative) [17].
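To make the GMM view concrete, the following sketch computes the soft correspondences used in the expectation step of the standard CPD formulation: each preoperative centerline point acts as a Gaussian centroid and each intraoperative point as an observation. The isotropic variance `sigma2` and the uniform outlier weight `outlier_weight` are parameters of that standard formulation; their values in the pipeline are not stated here, so they are assumptions, as is the function name.

```python
import numpy as np

def cpd_posterior(preop_pts, intraop_pts, sigma2, outlier_weight=0.0):
    """Posterior P[m, n] that centroid m (preoperative) generated point n (intraoperative)."""
    Y = np.asarray(preop_pts, dtype=float)    # (M, D) GMM centroids
    X = np.asarray(intraop_pts, dtype=float)  # (N, D) observations
    M, D = Y.shape
    N = X.shape[0]
    sq_dist = np.sum((Y[:, None, :] - X[None, :, :]) ** 2, axis=2)  # (M, N)
    numerator = np.exp(-sq_dist / (2.0 * sigma2))
    # Contribution of the uniform outlier component (zero when outlier_weight == 0).
    uniform = ((2.0 * np.pi * sigma2) ** (D / 2.0)
               * outlier_weight / (1.0 - outlier_weight) * M / N)
    denominator = numerator.sum(axis=0, keepdims=True) + uniform
    return numerator / np.maximum(denominator, np.finfo(float).tiny)
```

With a non-zero outlier weight the columns of the posterior sum to less than one, which is how noisy intraoperative points are prevented from pulling strongly on the preoperative model.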

The CPD implementation of [9] allows for tuning of two variables: \(\alpha \), determining the deformability of the preoperative model when aligning it with the intraoperative model, and \(\beta \), determining the size of the Gaussian kernel that was used to find the coherent point in the intraoperative model. Both variables were optimized by means of grid search, with values in the ranges of and . The TRE was minimized by grid searching the number of points that are used for registration in the range of . The optimal combination of settings was 0.3, 550, 8 for \(\alpha \), \(\beta \) and the number of nearest points respectively.
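The roles of the two variables can be illustrated with the deformation model of the standard non-rigid CPD formulation, in which the moving points are displaced by a Gaussian-kernel-weighted sum of coefficients and the coefficients are obtained from a regularized linear system. The sketch below assumes that \(\alpha \) corresponds to the regularization weight and \(\beta \) to the kernel width of that formulation; it is not the implementation of [9], and all function names are hypothetical.

```python
import numpy as np

def gaussian_kernel(points, beta):
    """Kernel G[i, j] = exp(-||y_i - y_j||^2 / (2 beta^2)) between moving points."""
    pts = np.asarray(points, dtype=float)
    sq_dist = np.sum((pts[:, None, :] - pts[None, :, :]) ** 2, axis=2)
    return np.exp(-sq_dist / (2.0 * beta ** 2))

def solve_weights(posterior, intraop_pts, preop_pts, kernel, alpha, sigma2):
    """Solve (diag(P1) G + alpha * sigma2 * I) W = P X - diag(P1) Y for W."""
    P = np.asarray(posterior, dtype=float)  # (M, N) soft correspondences
    X = np.asarray(intraop_pts, dtype=float)
    Y = np.asarray(preop_pts, dtype=float)
    P1 = P.sum(axis=1)
    A = P1[:, None] * kernel + alpha * sigma2 * np.eye(len(Y))
    B = P @ X - P1[:, None] * Y
    return np.linalg.solve(A, B)

def deform(points, weights, beta):
    """Displace the moving points: T(Y) = Y + G W."""
    pts = np.asarray(points, dtype=float)
    return pts + gaussian_kernel(pts, beta) @ weights
```

In this formulation a larger \(\alpha \) shrinks the coefficients and so yields a stiffer model, while a larger \(\beta \) makes neighbouring points move more coherently.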

2.3 Data

The complete dataset contained 203 stacked 2D US volumes, of which 106 volumes, acquired in 24 patients, were considered of sufficient quality. In 96 volumes, the hepatic and portal veins were delineated, of which 85 were used in training and validation of the segmentation network. The main reason for exclusion was incorrect stacking of the 2D US slices, caused either by rapid turning movements of the US probe by the operator or by tracking or reconstruction errors. Patients aged \(\ge \)18 years who were scheduled for open surgery, with centrally located primary or secondary liver lesions of any origin, near the vasculature and of diameter < 8 cm, were included in the dataset. Preoperative scans used for registration were no older than 2 months.

Volumes were acquired by coupling a T-shaped intraoperative US probe (T-Shaped Intraoperative I14C5T, BK Medical, Herlev, Denmark) with an electromagnetic tracking system (Aurora, Northern Digital, Ontario, Canada). Calibration between the tracking sensor and the US image was performed using the method described in [2]. CustusX [1] was used for acquiring the tracked images, which were then stacked into a volume using pixel nearest neighbor reconstruction. During acquisition, the US operator was instructed to acquire large volumes following a straight path from segments 4a and 8 to segments 4b and 5. Five different operators acquired the US volumes and each volume was delineated by one of four annotators using 3D Slicer. Unclear delineations of structures were validated by a radiologist. Five scans were delineated by multiple annotators with the aim of setting a manual gold standard against which the automatic segmentation performance could be compared. The hepatic and portal veins were segmented separately. Volume sizes ranged from \(293 \times 396 \times 526\) to \(404 \times 572 \times 678\) voxels, depending on the zoom of the 2D slices and the length of the scanning trajectory, and were downsampled to 40% prior to training, similar to [20]. Pre- and intraoperative data of 11 patients, accounting for 11 scans that contained tumor lesions, were used for evaluation of the registration pipeline.
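The idea behind pixel nearest neighbor reconstruction can be sketched as follows: every pixel of every tracked 2D frame is mapped into the volume and written to the nearest voxel. The sketch assumes that each frame comes with a single 4×4 matrix combining probe calibration and tracker pose, that pixel and voxel spacings are isotropic, and that pixels falling into the same voxel are averaged; hole filling is omitted. It does not reproduce the CustusX implementation, and the function name is hypothetical.

```python
import numpy as np

def pnn_reconstruct(frames, poses, pixel_spacing, voxel_spacing, volume_shape):
    """Stack tracked 2D frames into a 3D volume by nearest-voxel insertion.

    frames: list of 2D arrays; poses: list of 4x4 matrices mapping image-plane
    coordinates in mm (x along columns, y along rows, z = 0) to volume coordinates in mm.
    """
    volume_sum = np.zeros(volume_shape, dtype=float)
    counts = np.zeros(volume_shape, dtype=int)
    shape = np.array(volume_shape)[:, None]
    for frame, pose in zip(frames, poses):
        frame = np.asarray(frame, dtype=float)
        rows, cols = np.indices(frame.shape)
        # Homogeneous image-plane coordinates of every pixel, shape (4, n_pixels).
        pts = np.stack([cols.ravel() * pixel_spacing,
                        rows.ravel() * pixel_spacing,
                        np.zeros(frame.size),
                        np.ones(frame.size)])
        world = (np.asarray(pose, dtype=float) @ pts)[:3]
        idx = np.rint(world / voxel_spacing).astype(int)  # nearest voxel index
        inside = np.all((idx >= 0) & (idx < shape), axis=0)
        i, j, k = idx[:, inside]
        # np.add.at accumulates correctly when several pixels hit the same voxel.
        np.add.at(volume_sum, (i, j, k), frame.ravel()[inside])
        np.add.at(counts, (i, j, k), 1)
    filled = counts > 0
    volume = np.zeros(volume_shape, dtype=float)
    volume[filled] = volume_sum[filled] / counts[filled]
    return volume, filled
```

Voxels that receive no pixel remain empty in this sketch, which is one way in which rapid probe movements or tracking errors can degrade the reconstructed vasculature.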

3 Results

The reduced filter 3D U-Net obtained Dice scores of \(0.77\,\pm \,0.09\), \(0.65\,\pm \,0.25\) and \(0.66\,\pm \,0.13\) for combined vasculature, hepatic and portal veins respectively. These values are lower than the inter-observer Dice scores (\(0.85\, \pm \, 0.04\), \(0.88\,\pm \,0.02\), \(0.74\, \pm \, 0.12\)) for combined, hepatic and portal vasculature respectively. Figure 2 shows the segmentation result for a single case, for the different types of vasculature. The majority of mislabelling occurred on the peripheral segments of the vasculature (i.e. small vessels).

Fig. 2. Example of vascular segmentation prior to registration, with (a) all vasculature, (b) hepatic vasculature, (c) portal vasculature, with Dice scores of 0.82, 0.72 and 0.82 respectively. The ground truth delineation is indicated in green and the automatic segmentation in blue. (Color figure online)

Fig. 3. (a) Registration accuracy between the vascular centerlines expressed as RMSE for the combined vasculature, hepatic and portal separately. (b) Registration accuracy of all vasculature vs the minimum between the hepatic and portal vasculature. Dots represent outliers.

Fig. 4. (a) Lesion TRE after registration compared between the individual cases, based on whole, solely hepatic and solely portal segmentation. (b) Lesion TRE relative to distance to vasculature in ground truth segmentations.

The distribution of the RMSE of the registered vasculature using CPD is summarized in Fig. 3a. On average, the RMSE of the combined vasculature (\(4.4\, \pm \, 3.9\) mm) is lower than those calculated for the hepatic (\(7.0\, \pm \, 7.5\) mm) and portal veins (\(4.8\, \pm \, 4.4\) mm). Nevertheless, Fig. 3a shows a similar RMSE distribution for the combined, hepatic and portal vein registrations. Clinical accuracy, measured as the TRE between the tumor position acquired through US and its preoperative registered counterpart, is shown in Fig. 3b. On average, selecting the lower of the hepatic and portal TREs (\(7.1\, \pm \, 3.7\) mm) results in a lower TRE than using the combined vasculature (\(8.9\, \pm \, 5.3\) mm). This can also be seen in Fig. 4a, which compares the obtained TREs for each patient. In 10 out of 11 cases, a lower TRE was found by using only the hepatic or only the portal vein. In 9 out of 11 cases the TRE was below 10 mm (considered the viable clinical threshold). The total computation time of the pipeline is 62 ± 5.37 s, 68.3 ± 6.23 s and 84.5 ± 11.2 s for hepatic, portal and all vasculature respectively. Figure 4b shows linear correlation coefficients of 0.796 and 0.331 between the lesion-to-vasculature distance and the TRE, when registration is performed on ground truth and automatic segmentations respectively.
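A linear correlation coefficient of this kind can be computed as the Pearson coefficient between the per-case lesion-to-vasculature distances and the per-case TREs. The sketch below assumes that this is the coefficient meant and uses a hypothetical function name; no patient data are reproduced.

```python
import numpy as np

def pearson_r(distances_mm, tre_mm):
    """Pearson linear correlation coefficient between two equally long sequences."""
    x = np.asarray(distances_mm, dtype=float)
    y = np.asarray(tre_mm, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float(np.sum(xc * yc) / np.sqrt(np.sum(xc ** 2) * np.sum(yc ** 2)))
```

The same value is returned by `np.corrcoef(x, y)[0, 1]`; the explicit form is shown only to make the definition visible.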

4 Discussion

We have presented a methodology for hepatic vasculature registration that utilizes a deep neural network to segment the hepatic and portal vein from 3D US volumes.

The network was validated over several clinical cases, indicating the feasibility and robustness of this approach over inter-patient anatomical variations. Whilst the segmentation accuracy of all vasculature is comparable to previous studies, it is inferior to the inter-observer variability. The largest differences between manual and automatic segmentations are found when segmenting small vasculature. This might be caused by the large class imbalance between background and foreground (i.e. parenchyma and vessels). On average we found that the vessel-to-parenchyma area ratio is \(2.3\%\) for the hepatic and \(1.7\%\) for the portal vein. This negatively affects the Dice score, since it does not fully utilize the spatial information on scales (1 pixel in a smaller volume is more important than in a larger volume), nor does it utilize the structure of the vasculature. In the future we will evaluate different cost functions such as focal loss [13]. Focal loss compensates for class imbalance by down-weighting well-classified (easy) examples, so that training focuses on hard, misclassified examples.
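As an illustration of how focal loss shifts the emphasis towards hard examples, a commonly used binary formulation is sketched below. The focusing parameter `gamma` and the optional foreground weight `class_weight` are generic parameters of that formulation; no values have been chosen for this pipeline, so the defaults are assumptions, as is the function name.

```python
import numpy as np

def binary_focal_loss(prob, target, gamma=2.0, class_weight=None, eps=1e-7):
    """Mean binary focal loss: -(1 - p_t)^gamma * log(p_t).

    prob: predicted foreground (vessel) probabilities; target: 0/1 labels.
    With gamma = 0 and no class weight this reduces to binary cross-entropy.
    """
    p = np.clip(np.asarray(prob, dtype=float), eps, 1.0 - eps)
    t = np.asarray(target, dtype=float)
    # p_t is the probability assigned to the true class of each voxel.
    p_t = np.where(t == 1, p, 1.0 - p)
    loss = -((1.0 - p_t) ** gamma) * np.log(p_t)
    if class_weight is not None:
        # Optional weighting of foreground versus background voxels.
        loss = loss * np.where(t == 1, class_weight, 1.0 - class_weight)
    return float(loss.mean())
```

With `gamma=2`, a voxel already classified with probability 0.9 contributes only 1% of its cross-entropy loss, so the abundant, easily classified parenchyma voxels dominate the training signal far less.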

Registration accuracy, expressed as the average RMSE of the residual distances between the intraoperative centerlines and their registered counterparts, was found to be comparable for all three cases. Nevertheless, the TRE was lower (i.e. clinical accuracy was higher) when using only the hepatic or only the portal vein. This supports the validity of a non-rigid registration approach in which the hepatic and portal vasculature are segmented and registered independently from each other. Even though the majority of the cases resulted in a TRE below 10 mm, several aspects could be improved to obtain a more accurate registration. In the future, we plan to combine the registrations obtained for the two different vasculatures, depending on the tumor proximity to one or the other vessel. This is because the tumor position differs for each patient and the mechanical and biological properties of the tumor differ from those of the vasculature. Therefore, a better approximation of the registered tumor position would consider these additional parameters. Other important aspects that influence the registration accuracy are the US scanning process and its reconstruction. The majority of the volumes were acquired in the cranio-caudal direction, starting from segments 4a or 8 and ending at segments 4b or 5. However, within this process, factors such as speed of acquisition, EM interference and regularity of the volume contributed to the reconstruction accuracy of the underlying vasculature. This high variability in the parameters influencing the US volume also resulted in large variations in the reconstructed vasculature and therefore in registration accuracy.

When the TRE is determined based on ground truth segmentations, there seems to be a correlation between the lesion-to-vessel distance in the preoperative imaging and the TRE, which is not seen when using automatic segmentations. Hence, we argue that better segmentation quality would further support upfront prediction of which vasculature to select for the registration. We will implement this selection criterion in our pipeline, allowing us to select the most promising registration first.

Finally, the results show that both the segmentation and registration processes are highly dependent on the quality and quantity of the information contained in the US volume. Factors such as vessel-to-parenchyma ratio, scanning direction, zoom and reconstruction artefacts strongly influence the outcome of the proposed methodology. Similar findings were also reported in [21], where only US images containing vessel-to-image ratios greater than 1% were selected for registration. In the future we will quantitatively evaluate the impact of these factors on the registration accuracy and develop deep learning methods that aim at automatically evaluating the quality of the acquired US volumes.

In conclusion, we have demonstrated that multi-class segmentation of hepatic vasculature from US volumes is feasible and, when combined with a selective non-rigid registration, accurate registration can be achieved. To our knowledge, this is the first work that utilizes deep learning based segmentation for registration purposes in hepatic ultrasound imaging where hepatic and portal vein are segmented separately. Given the promising results, validated over several patients, we are planning a prospective study in order to integrate this approach within the clinical routine of computer assisted liver interventions.