Introduction

Ultrasound (US) imaging is extensively used in both diagnostic and interventional scenarios. The high accessibility of US systems and their low cost make US the modality of choice to obtain real-time imaging of various anatomies. In the case of soft tissue applications and subcutaneous pathologies, such as soft tissue sarcomas (STS), US yields morphological information about the target tumor [15], e.g., tumor heterogeneity or vascularization. Being free of ionizing radiation, US is also an ideal imaging modality for follow-up acquisitions after surgical interventions, such as in the case of STS, where there is a high risk of recurrence [1].

To achieve optimal acoustic coupling of the US transducer and obtain good visibility of the target anatomy, it is necessary to apply a certain force onto the contact surface. This naturally results in an unavoidable deformation of the imaged tissues. For certain clinical applications, the induced deformation can be exploited to infer mechanical properties of the imaged tissue by means of a static or dynamic excitation, a concept exploited in classical US elastography [12]. On the other hand, this deformation can hamper the exact localization of anatomies of interest, such as STS tumor masses, as well as the precise assessment of their geometry and volume. Additionally, as these deformations do not usually remain constant during free-hand 3D ultrasound acquisitions, varying deformations across the individual 2D images of a US sweep eventually impair the volumetric reconstruction. Despite the advantages free-hand 3D-US could provide in several clinical settings, both diagnostic [13] and interventional [11], these drawbacks hinder a broader acceptance of such approaches in practice. To overcome these deformation-induced problems, the inhomogeneous deformation arising in free-hand 3D US is reconstructed in [18] through a combination of probe tracking information and non-rigid image-based registration, although limited to the deformation in the axial direction. Similarly, in [2, 21] nonlinear tissue distortions produced during long 3D US acquisitions are corrected via subsequent elastic registrations of image pairs. All these methods, however, do not aim at removing the tissue deformation, but at making it homogeneous along the entire US volume.

Recently, robotic platforms for US imaging (rUS) have been proposed [5, 17]. Such platforms commonly include force sensing, either via force and torque sensors attached to the robot end-effector or via elastic joints, as in [8]. Therefore, contrary to free-hand US, such platforms are able to apply a constant pressure onto the examined anatomy and thus maintain a nearly constant deformation, which significantly simplifies volumetric reconstruction. However, a deformation will still be present due to the necessary contact force. This is of particular interest in an interventional scenario, when registering preoperatively acquired tomographic image data (e.g., MRI or CT) to intra-operatively acquired 3D US data, as the amount of deformation between the two volumes directly drives the required computational effort and thus precious runtime.

Related work

Existing methods for correcting the induced deformation in US make use of a combination of various types of information, such as the US image data, the position of the US transducer (via a tracking system) and the applied force. A group of proposed works additionally employs biomechanical models that reproduce the mechanical properties of the tissue under examination. These include models based on the finite element method (FEM) to predict and correct for the deformation [3]. However, FEM-based methods require a priori knowledge of the tissue properties, such as density or elastic moduli, to obtain an accurate representation. In practice, such parameters are difficult to retrieve, especially in pathological indications. Besides FEM modeling, 3D mesh models are also used to estimate the deformation [14]. Although these models are not patient-specific themselves, the method requires the acquisition of the surface of interest during the procedure, using laser scan technology or similar techniques. Furthermore, both methods assume that deformations are present only along the axial direction of the US transducer. In [4], a proof of concept is proposed to obtain mechanical parameters specific to the examined tissue using mutual information between the US images, avoiding the use of generalized parameters. Finally, a recent method proposed in [16] solely uses tracked US images and the applied force in order to extrapolate the tissue deformation and eventually recover deformations in the axial and lateral directions. However, both methods are presented for 2D images only and validated only on simulations.

Table 1 Comparison with related work

Contributions

We present an image-based approach to obtain compression-free 3D US volumes based on robotic acquisitions (see Fig. 1). We do not employ a technique based on biomechanical models, as these require the retrieval of additional parameters that impair their usability, especially since a patient-specific model is required. Notably, these limitations are reflected in the validation of such methods, which is very often performed on synthetic data only (both regarding tissue models and US images), except for [14], which is validated on one clinical case.

The robotic platform presented in the following facilitates accurate tracking of the US transducer position and provides control over the applied force via direct force control techniques. The latter ensures that the force necessary to visualize the anatomy of interest is applied and that information about the applied force is available along the whole scan trajectory. With regard to the mentioned state of the art, our proposed method features the following aspects:

  (a) it makes use of the tracked 2D US images and the force information only, i.e., no additions to a generic robotic US setup are required;

  (b) it is able to recover deformations in both the axial and lateral direction;

  (c) it uses a novel deformation interpolation (3D inpainting) of sparsely measured elastic information to retrieve a full deformation-corrected 3D US volume.

We provide validation for our method on 30 3D acquisitions performed on volunteers and evaluate the corrected volumes against ground truth US data. Table 1 categorizes this work and the related state of the art according to their characteristics. Finally, to allow a better comparison of different methods for deformation correction and to improve reproducibility, the human acquisitions performed to validate this work are made publicly available.

Fig. 1

Proposed workflow for deformation correction of 3D US volumes. For an acquisition of length L, K 2D deformation estimations are performed

Methods

The aim of the presented method is to obtain a 3D US volume free of the deformation induced by the contact force of the US transducer during an acquisition performed by a rUS platform. Following an image-based approach, this requires correcting the induced deformation of each individual 2D image that forms the 3D volume, which would imply extensive acquisition and computation times depending on the length of the trajectory. To maintain high clinical usability, we propose to perform the 2D deformation estimation sparsely along the planned trajectory, while providing a complete 3D deformation correction employing a novel 3D-inpainting scheme. A schematic representation of the overall workflow is shown in Fig. 1. Initially, 2D US images are acquired along a linear trajectory of length L. During the whole acquisition, the position of the US transducer is tracked via the forward kinematics of the robotic manipulator, together with the fixed base force \(\mathbf {F_\mathrm{base}}\) applied onto the surface. At each designated position, we estimate the 2D deformation induced by the transducer pressure as described in “2D deformation estimation” section. This estimation is performed at K equidistant locations (separated by a distance of \(L/(K-1)\)). Then, the estimated sparse 2D deformation fields are inpainted using a graph-based representation of the whole US sweep in order to obtain a volumetric deformation field. This approach allows filling in the missing information where no direct estimation was performed. Finally, the undeformed 3D volume is reconstructed as described in “3D reconstruction” section.
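As an illustration of the sparse sampling just described, the K estimation locations along a linear trajectory can be computed as follows. This is a minimal sketch in Python; the function name is ours, not from the authors' implementation.

```python
import numpy as np

def estimation_positions(L: float, K: int) -> np.ndarray:
    """K equidistant 2D-estimation locations along a linear trajectory of
    length L, separated by L / (K - 1) (first at 0, last at L)."""
    return np.linspace(0.0, L, K)

# With the values used in the acquisitions reported below (L = 70 mm, K = 15),
# consecutive estimation locations are 5 mm apart.
positions = estimation_positions(70.0, 15)
spacing = positions[1] - positions[0]
```
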

Fig. 2

Exemplary force profile and induced deformation. Forces applied during 2D deformation estimation (left); resulting force-dependent deformed US images for 2 N in red and 8 N in blue (right)

2D deformation estimation

While the transducer is moved along the planned trajectory and the base force \(\mathbf {F_\mathrm{base}}\) is applied, for each of the K locations, the tissue deformation due to the applied pressure is examined. Inspired by the approach proposed by Sun et al. [16], 2D deformation fields are generated from a series of 2D images acquired at the same location with N different forces \(\mathbf {F_{i}} \in [\mathbf {F_\mathrm{start}}, \mathbf {F_\mathrm{end}}]\) where \(i=1,\ldots ,N\). These contact forces are increased by \(\mathbf {F_\mathrm{step}}\) after a small temporal interval \(t_\mathrm{step}\), s.t.

$$\begin{aligned} \mathbf {F_{i}} = \mathbf {F_\mathrm{start}} + (i-1)\mathbf {F_\mathrm{step}}, \end{aligned}$$
(1)

yielding N force-dependent 2D images \(I_i\) (\(i=1,\ldots ,N\)) as illustrated in Fig. 2. We retrieve the resulting deformation between \(I_i\) and \(I_{i+1}\) at each pixel using a preconditioned fluid-elastic diffeomorphic demons approach as described in [19, 20] with the following parameter settings: standard deviation for elastic demons \(\sigma _e=1\), standard deviation for fluid demons \(\sigma _f=1\) and step size \(\tau =0.05\). The parameters for the diffeomorphic demons algorithm have been chosen to reach a meaningful compromise between speckle size and the desired degree of regularity: while increasing both standard deviations decreases the contributions of individual speckle patterns, too small standard deviations do not achieve satisfactory regularization. Additionally, adjusting \(\sigma _e\) changes the regularity of the entire solution, i.e., the deformation field, while adjusting \(\sigma _f\) influences the regularity of the individual updates, i.e., how susceptible the algorithm is to spurious and noisy local deformations.
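The force stepping of Eq. (1) can be sketched as follows; an illustrative Python snippet with names of our own choosing, not taken from the authors' implementation.

```python
import numpy as np

def force_schedule(f_start: float, f_end: float, f_step: float) -> np.ndarray:
    """Contact forces F_i = F_start + (i - 1) * F_step, i = 1..N (Eq. 1),
    spanning the range [F_start, F_end]."""
    n = int(round((f_end - f_start) / f_step)) + 1
    return f_start + f_step * np.arange(n)

# e.g., stepping from 0 N to 20 N in 1 N increments yields N = 21 images;
# each consecutive pair (I_i, I_{i+1}) is then registered with the demons.
forces = force_schedule(0.0, 20.0, 1.0)
```
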

Fig. 3

Qualitative comparison of model quality. The evaluation shows the position of a single tracked sample for all measured incremental force steps in reference to the evaluated models based on first-, second- and third-order polynomials

Fig. 4

2D deformation tracking: 5 pixels are tracked—manually and using the demons-based approach—while forces from \(F_\mathrm{start}\) to \(F_\mathrm{end}\) are applied. The points are marked on the first (left) and last (center) frame of the sequence. On the right, their position in the axial direction is displayed: manually tracked trajectory (solid line) and trajectory tracked via demons (dashed line)

To finally allow for a continuous modeling of the expected tissue displacements as a function of the applied force, a regression function is fitted to the sampled points. In our case, we model the force-dependent deformations using fourth-order polynomial functions in order to regress the deformation field corresponding to the force-free configuration, i.e., at zero force. It is important to note that while the method in [16] utilizes median filtering on the obtained deformation fields to guarantee spatial consistency, the proposed reconstruction in “3D reconstruction” section ensures this property in an implicit manner. In conclusion, we obtain a model of the tissue deformation with respect to the applied force. Since each 3D acquisition is performed using \(F_\mathrm{base}\) (e.g., 5 N), with the exception of the locations where the 2D estimation is computed, we use the regression model to estimate the position at zero force for each volume voxel. Examples of such models obtained by pixel tracking are shown in Figs. 3 and 4.
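The per-pixel regression to the force-free state can be sketched as follows, using NumPy's polynomial fitting; the synthetic force-displacement samples and the function name are illustrative only, not the authors' data or code.

```python
import numpy as np

def zero_force_displacement(forces, displacements, degree=4):
    """Fit a polynomial to the tracked displacement of one pixel as a function
    of the applied force and evaluate it at 0 N (the force-free state)."""
    coeffs = np.polyfit(forces, displacements, deg=degree)
    return np.polyval(coeffs, 0.0)

# Synthetic per-pixel samples: nonlinear response, stronger at low forces.
f = np.linspace(0.0, 20.0, 21)        # applied forces [N]
d = 0.30 * f - 0.008 * f**2           # accumulated axial displacement [mm]
d0 = zero_force_displacement(f, d)    # displacement at zero force
```

Evaluating the fitted polynomial at \(F_\mathrm{base}\) instead of 0 N would analogously give the displacement present during the actual acquisition.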

3D reconstruction

As the 2D deformation estimation takes most of the time during a 3D US acquisition (see “Acquisitions” section) and acquiring deformation fields in a dense sampling is impractical, the estimation is performed at K equidistant sampling positions along the scanning trajectory in order to keep the proposed method applicable in a clinical scenario. Thus, the obtained 2D deformation fields have to be propagated, or inpainted, to the image positions where no sampling has been performed. Inpainting can be described as the process of filling in missing values of a set based on the surrounding known samples. There are several mathematical approaches to model inpainting, one of which is the solution of a set of partial differential equations (PDEs), where the known values represent the Dirichlet boundary conditions.

It is possible to solve such a discrete PDE-like problem with a graph-based method like the one proposed by Hennersperger et al. [9], which was introduced for the solution of a quadratic optimization problem. Differently from the original formulation, our inpainting problem requires the generation of new graph nodes to interpolate the missing information. We choose anisotropic diffusion of the deformation fields over the graph structure, so that we can define the required diffusivity properties for our problem, i.e., high diffusion in the elevational direction and slow information propagation in the axial and lateral directions. That is, we want the deformations to propagate more along the sweep direction than within the image plane. This can be implemented efficiently over the irregular graph built by the method, i.e., the optimization over the graph is performed as a local operation. The irregular nature of the graph also allows us to employ non-parallel acquisitions.

For our problem, the boundary conditions (the known values) are the sparse 2D deformation fields. Inpainting is performed over the deformation fields component-wise using a preconditioned conjugate gradient (PCG) method with a Jacobi preconditioner as the solver. Weights inside the images were fixed to \(e^{-3}\) to inhibit blurring within them, while weights between different images were set as described by the original authors [9]; this is applied to both the axial and lateral deformation components.
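A minimal sketch of such a graph-based Dirichlet inpainting is given below, for one deformation component over a sweep of S slices with P pixels each. It is a simplified illustration only: a regular grid graph with unit elevational weights of our own choosing and the within-image weight of \(e^{-3}\) stated above, solved with a Jacobi-preconditioned conjugate gradient; the actual method of [9] operates on an irregular graph.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import cg, LinearOperator

def inpaint_deformation(field, known_slices, w_elev=1.0, w_img=np.exp(-3)):
    """Fill one deformation component over an (S, P) sweep, given measured
    rows `known_slices`, by solving the graph-Laplacian Dirichlet problem."""
    S, P = field.shape
    idx = np.arange(S * P).reshape(S, P)
    rows, cols, w = [], [], []
    # elevational edges between consecutive slices (strong diffusion)
    rows += list(idx[:-1].ravel()); cols += list(idx[1:].ravel())
    w += [w_elev] * ((S - 1) * P)
    # in-plane edges between neighboring pixels (weak diffusion)
    rows += list(idx[:, :-1].ravel()); cols += list(idx[:, 1:].ravel())
    w += [w_img] * (S * (P - 1))
    A = sp.coo_matrix((w, (rows, cols)), shape=(S * P, S * P))
    A = (A + A.T).tocsr()
    L = sp.diags(np.asarray(A.sum(axis=1)).ravel()).tocsr() - A  # Laplacian

    known = np.zeros(S * P, dtype=bool)
    known[idx[known_slices].ravel()] = True
    unk = ~known
    L_uu = L[unk][:, unk]                       # unknown-unknown block
    b = -(L[unk][:, known] @ field.ravel()[known])

    diag = L_uu.diagonal()                      # Jacobi preconditioner
    M = LinearOperator(L_uu.shape, matvec=lambda v: v / diag)
    x, info = cg(L_uu, b, M=M, atol=1e-10)
    out = field.ravel().astype(float).copy()
    out[unk] = x
    return out.reshape(S, P)

# toy sweep: deformation measured only at the first and last slice
field = np.zeros((5, 4))
field[4, :] = 4.0
filled = inpaint_deformation(field, known_slices=[0, 4])
```

Because the graph is a path in the elevational direction, the harmonic extension between the two known slices is the linear ramp, which the solver recovers.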

This graph-based inpainting results in a final deformation field that describes how each pixel along the acquired volume is affected by the applied force (forward deformation field). We invert this field to correct for the deformation induced by \(\mathbf {F_\mathrm{base}}\) during the acquisition. An arbitrary displacement field is not invertible in the general case, but due to the employed tracking scheme, the resulting field is diffeomorphic, i.e., invertible. It is inverted numerically using an iterative method based on an open-source implementation.Footnote 1 After applying the inverse deformation field to the individual images of a full acquisition, a 3D volume is created using a US reconstruction method. We employ a voxel-based interpolation method, in which for each voxel a scalar is computed as a weighted interpolation of the relevant US image samples. This approach is described in [7] for advanced reconstruction using tensors instead of scalar values. The resulting undeformed 3D volume thus appears as if acquired with no applied force.
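The numerical inversion of a diffeomorphic displacement field can be illustrated with a simple fixed-point iteration, shown here in 1D for brevity; this is a generic sketch, not the referenced open-source implementation.

```python
import numpy as np

def invert_displacement_1d(u, n_iter=50):
    """Invert a 1-D displacement field u (in pixel units) via the fixed-point
    iteration v <- -u(x + v); for a small, smooth (diffeomorphic) field this
    contraction converges to the inverse displacement v."""
    x = np.arange(len(u), dtype=float)
    v = np.zeros_like(u)
    for _ in range(n_iter):
        # evaluate u at the warped positions x + v by linear interpolation
        v = -np.interp(x + v, x, u)
    return v

# Sanity check: composing the field with its inverse is ~identity.
x = np.arange(64.0)
u = 0.5 * np.sin(2 * np.pi * x / 63.0)   # smooth forward displacements
v = invert_displacement_1d(u)
residual = v + np.interp(x + v, x, u)    # should vanish everywhere
```
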

Hardware setup and experiments

Hardware setup

We make use of a robotic platform for autonomous US acquisitions: the system is composed of a KUKA LBR iiwa R800 manipulator (KUKA Roboter GmbH, Augsburg, Germany), controlled using the ROSFootnote 2 framework. The robotic arm is equipped with torque sensors at its joints, which allow for the estimation of the forces and torques applied by (or to) the robot’s end-effector. B-mode US images are acquired using an Ultrasonix RP US machine (BK Ultrasound, Peabody, MA, USA) with a linear transducer (frequency: 3.3 MHz, depth: 55 mm, gain: 50%) attached to the robot’s flange. Images are transferred using the OpenIGTLink communication protocolFootnote 3 to a workstation (Intel Core i7, NVIDIA GTX 1080), where they are synchronized with the transducer tracking and force streams.

Acquisitions

3D US volumes were acquired on the thighs of five healthy volunteers (age 26–30, 4 males, 1 female). We selected the volunteers’ thighs as the area for our experimental acquisitions, since extremities are prone to be affected by soft tissue pathologies, such as STS [10], and are therefore often subject to US examination. For each volunteer, six volumetric acquisitions of length \(L = 70\,\hbox {mm}\) were performed, taking 26 s each on average, for a total of 30 US volumes. Different base forces, applied orthogonally to the contact surface, were employed during the different US sweeps: \(\mathbf {F_\mathrm{base}} \in \lbrace 2,5,8,10,12,15 \rbrace \) Newton. A 2D deformation field, as described in “2D deformation estimation” section, was estimated at \(K = 15\) positions along the planned trajectory, with \(\mathbf {F_\mathrm{start}} = 0\,\hbox {N}\) and \(\mathbf {F_\mathrm{end}} = 20\,\hbox {N}\). \(\mathbf {F_\mathrm{step}}\) varied between 0.25 and 1 N within a single estimation, since smaller steps were found beneficial to better capture the deformation of the initial tissue layers at lower forces. That is, those layers tend to undergo large deformations already at low forces, and using small steps at the beginning of the evaluation allows pixel movements to be tracked more accurately. The computation of a 2D deformation field from the images acquired at one location took 186 s on average, with a maximum memory usage of approximately 21 GB for the computations of a full acquisition.

A ground truth volume, free from any compression by the US transducer, was also acquired for every individual by maintaining the probe at about 5 mm from the contact surface while applying a thick layer of US gel to guarantee acoustic coupling. The precise and stable movement required for the ground truth volume acquisition is made possible by the use of such a robotic system. It is important to note that previous works on the topic do not validate their results using real undeformed US images or volumes, but rather using synthetic ones or via registration with other image modalities. On the other hand, due to the missing contact with the patient surface, the ground truth volumes do not present deformations, but the visibility of the underlying anatomy is strongly impaired.

Fig. 5

2D tracking error: mean and standard deviation of the Euclidean error [mm] between demons-based and manual tracking. 5 points over 5 sequences (25 points in total) of 35 force steps were evaluated; a subsample of the results (10 force values of 35) is shown

Table 2 Quantitative inpainting accuracy: error in mm (average and SD) of an interpolated deformation field with respect to the directly estimated one

Validation

To validate the framework for deformation-compensation, we assess the quality of the individual components as well as the overall system:

  • 2D deformation estimation We validate the pixel tracking and deformation regression on 5 deformation sequences. For each, 5 points are manually selected on the first frame and their displacement is tracked over the successive frames. The position of the same points is also manually annotated, such that the absolute accuracy of the automatically tracked trajectories can be assessed.

  • Deformation field interpolation To evaluate our proposed sparse sampling scheme, we validate how sparsely a 2D deformation estimation can be performed while still obtaining a useful volumetric deformation correction, as a trade-off between quality and acquisition time is needed. Based on the 15 deformation fields estimated per volume, we first compare different subsamplings (leave-one-out) and compute the Euclidean norm of the resulting difference in deformation for 18 volumes. Second, for a given volume, we exclude the computed central deformation field and interpolate the remaining ones to obtain it. We perform this 7 times, incrementally removing more neighboring fields, until only the first and last samples are used for interpolation. The resulting field at the central location of the volume is then compared to the one directly computed by the (ground truth) 2D estimation.

  • 3D volume undeformation We validate the quality of the overall method using the target registration error (TRE) between the compounded US ground truth volume and the final undeformed volumes acquired while applying different forces (2, 5, 8 and 15 N).
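The TRE computation used in this last step amounts to averaging Euclidean distances between corresponding landmarks; a minimal sketch follows, where the point values are illustrative only.

```python
import numpy as np

def target_registration_error(p_gt, p_corr):
    """TRE between paired 3-D landmarks: mean and SD of the per-pair
    Euclidean distances (in the volumes' physical units, e.g., mm).

    p_gt, p_corr: (M, 3) corresponding points in the ground-truth and
    deformation-corrected volumes.
    """
    d = np.linalg.norm(p_gt - p_corr, axis=1)
    return d.mean(), d.std()

pts_a = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0]])
pts_b = np.array([[3.0, 4.0, 0.0], [10.0, 0.0, 5.0]])
mean_tre, sd_tre = target_registration_error(pts_a, pts_b)
```
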

Results

2D Deformation Estimation

The 2D deformation estimation, as performed along the planned US trajectory (“2D deformation estimation” section), allows tracking the displacement of the individual image pixels induced by the applied force. As can be seen in Fig. 3, the computed displacement is characterized by a nonlinear behavior, with a stronger deformation at lower forces due to the compression of the superficial and more elastic tissue layers. Therefore, while in [16] the authors propose to model this displacement using quadratic functions, we instead propose to use fourth-order polynomials to better capture the high flexibility of the subcutaneous and other relevant tissue layers.

In Fig. 4, the 5 points selected for validation of one specific sequence are shown at the beginning and at the end of the deformation sequence, together with the resulting models obtained from the demons-based tracking and the ground truth. For the 25 points selected over 5 different sequences, the error between their computed trajectories and the ground truth was found to be \(0.64 \pm 0.57\,\hbox {mm}\). Note that information on the tissue state at 0 N is already sampled in our model, so that we do not need to extrapolate to reach the undeformed state. In Fig. 5, the distribution of modeling errors is depicted for the evaluated force steps. The error of our tracking approach tends to accumulate over multiple force steps, with some sharp increases at instants where tissue layers yield to the increasing pressure.

Fig. 6

Qualitative inpainting accuracy: on the left, an exemplary deformation field computed by the 2D deformation estimation is shown. Inpainting is performed with the aim of reproducing the computed deformation as accurately as possible, while incrementally removing the surrounding neighboring fields to assess how sparse sampling affects output quality. The error magnitudes of the interpolated deformation field with respect to the one on the left are shown in the center (removing 2 neighbors) and on the right (removing 12 neighbors)

Deformation field inpainting

The results of the validation of our inpainting strategy using different subsamplings of the available deformation fields are presented in Fig. 7. For displacement fields sampled at a distance of 35 mm, an average error of 1 mm is obtained. The possibility to sample tissue deformation so sparsely also reduces the computational cost of a full acquisition and improves the clinical feasibility of the proposed method. We also validate the accuracy of the inpainting method by reconstructing a known displacement field that is excluded from the interpolation, together with a subset of its neighbors. In Fig. 6, the original deformation field, computed with the 2D estimation, is shown alongside the magnitudes of the Euclidean error between the inpainted fields and this baseline. As also shown in Table 2, the mean error increases with the number of neighboring samples removed from the inpainting process, as expected. The error obtained is comparable to that in Fig. 7.

3D volume undeformation

We validate the performance of the proposed deformation correction method using the target registration error (TRE) between the compounded ground truth and the undeformed volumes acquired with different forces (2, 5, 8 and 15 N—20 volumes in total). Anatomical landmarks were manually selected, with an average of \(142 \pm 14\) fiducial points per volume pair. Table 3 summarizes the measured distances between the chosen points: an increase from \(3.11 \pm 1.55\) to \(6.27 \pm 2.59\,\hbox {mm}\) can be observed with increasing force. The achieved correction is also shown in Fig. 8.

Fig. 7

Deformation subsampling: mean and SD of the Euclidean norm of the difference between dense and sparsely sampled deformation estimations

Discussion and conclusion

The obtained results show that the proposed method is able to capture the deformation induced by the US transducer during a 3D robotic acquisition and effectively correct for it. While a direct comparison to the current state of the art is difficult due to variations in the acquisition protocols, the reported deformation correction for a clinical case in [14] is \(3.5 \pm 0.4\,\hbox {mm}\), which is comparable to our findings for 5 N in Table 3. Such an error would be clinically acceptable for the target application, since diagnosed STS have an average size of 10 cm [6].

To improve the reproducibility of this work and allow future comparative evaluations, we publicly release the acquisitions performed on volunteers.Footnote 4 The dataset contains synchronized US images, tracking data and force information.

Table 3 Target registration error
Fig. 8

Comparison of deformed and undeformed volumes. Axial and lateral views of deformed (left), ground truth (center) and undeformed (right) volumes

The overall deformation estimation was able to better correct for low forces, since the deformations estimated from our fitted model are inherently subject to noise due to their local nature. Resulting undeformed volumes may contain artifacts at the interfaces of different tissue layers, as noticeable in Fig. 8, since tracking the deformation there is more difficult due to the layers’ diverse responses to the applied force. This effect could be reduced by a better regularization of the obtained deformation fields. It is worth noting that we do not aim to compensate for all the possible sources of deformation that might be present during a US acquisition, e.g., breathing motion or vascular pulsation; similarly to the state of the art, we tackle the deformation caused by the probe pressure only. Future work will include improvements to the model to incorporate constraints on the resulting deformation fields and to integrate information on elastic tissue behavior. Additionally, a validation of the method for multimodal volumetric registration would be beneficial to assess its potential in additional clinical settings. Finally, a prospective validation on clinical patients would allow evaluating the deformation of pathological tissue, opening the way to the clinical impact of robotic US imaging.