1 Introduction

The reconstruction of anatomical joint surfaces and angular relationships is of paramount importance in the surgical management of fractures and ligament injuries. Intra-operative fluoroscopic guidance, 3D imaging, or navigation is typically used to ensure anatomically and mechanically correct reduction, so that irregular joint loading and complications caused by aberrant biomechanics can be alleviated or avoided. Moreover, for technically demanding procedures, a pre-operative planning sketch is obligatory and helps the surgeon to achieve operational safety [4]. In many of these planning and verification steps, the bone axis serves as an important reference line (Fig. 1). While such axes can easily be planned on pre-operative static data, doing so consistently on live images during surgery is inherently more complex due to motion and a limited field of view. In addition, non-sterile interaction with planning software is undesirable. For this reason, axial alignment is typically verified by visual inspection and with hardware-based solutions such as the cable method, alignment rods, goniometers, or optical navigation, amongst others [12, 13, 20]. However, these methods either increase task complexity, are inherently imprecise, or require an open reduction or additional incisions regardless of the surgical technique used.

To address this, several methods have been proposed to automate detection of the bone axis on image data. Tian et al. [18] compute the femoral shaft axis by combining contour extraction with an analysis of intersecting line normals to the shaft contour. They recover the contour with Canny edge detection and identify the relevant straight line sections with the Hough transform and an active contour model based on Gradient Vector Flow. While this approach can deal with truncated bones, it requires the bone to be oriented upward on the X-ray image to isolate the relevant intersection points. Donnelley et al. [3] use a scale-space approach and approximate straight line parameters via the Hough transform. To deal with the ambiguous peak spread in the dual space encountered in real-world radiographs, this method relies on prior spread quantification, which falls short in the case of truncated bones. Subburaj et al. [17] use a 3D bone model reconstructed from pre-operative CT scans. They combine geometrically detected landmarks and maximal inscribed sphere fitting to detect the medial axis, which is then used to identify the anatomical and mechanical axes. Although very accurate results can be achieved, such 3D information is often unavailable and requires registration with the intra-operative 2D image.

Fig. 1. Examples for using the shaft axis of long bones as reference line.

To circumvent these limitations, we propose a simple and clinically motivated image-guided approach for detection of the anatomical axis of long bones on 2D X-ray images. We translate the established manual two-line/two-circle method [6, 7, 8, 10] into a learning-based extraction of anatomical features followed by a geometric construction based on a segmentation of the bone cortex outline. Following [9], region of interest (ROI) encoding of the relevant contour sections is used to cope with variability in image truncation and arbitrary image rotation. Moreover, the segmentation results can directly be used for registration of the detected axis on fluoroscopic live images. The method is evaluated for the femur and tibia in the knee joint, which are amongst the most prominent anatomies treated in trauma surgery. The reliability of the proposed method is evaluated and confirmed in an inter-rater study with three expert trauma surgeons.

2 Methods

The anatomical axis of long bones in a 2D image plane can be described by two auxiliary lines that follow the orientation of the anterior/posterior or medial/lateral contour of the bone shaft. In contrast to conventional radiographs with rather standardized imaging, this shaft area is usually truncated on intra-operative images due to a limited field of view and a joint-centered acquisition protocol. Furthermore, the largely linear shaft contour can suffer from structural changes due to, e.g., bony proliferation. To handle both effects, the relevant contour sections are first estimated and extracted from the image. Subsequently, these sections are masked based on positional probability and smoothed to reduce the influence of outliers. Lastly, the clinically motivated two-line method is used to calculate the bone axis.

2.1 Likelihood Encoding of Relevant Contour Regions

Given a binary bone segmentation mask S we extract the complete cortex contour \(\mathcal {K}\) by using a morphological erosion operation. With a cross-shaped \(3 \times 3\) structuring element \(X = \{(-1,0),(0,-1),(0,0),(0,1),(1,0)\}\) this equates to

$$\begin{aligned} \mathcal {K} = \mathrm {XOR}\left( S,\,\mathrm {erode}(S,X) \right) . \end{aligned} \tag{1}$$
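As an illustration of Eq. (1), the following is a minimal NumPy sketch, not the authors' implementation. The function names `erode_cross` and `extract_contour` are hypothetical, and treating pixels outside the image as background is an assumption that the text does not specify.

```python
import numpy as np


def erode_cross(mask: np.ndarray) -> np.ndarray:
    """Binary erosion with the cross-shaped 3x3 structuring element X.

    Assumption: pixels outside the image are treated as background.
    """
    padded = np.pad(mask.astype(bool), 1, constant_values=False)
    center = padded[1:-1, 1:-1]
    up = padded[:-2, 1:-1]
    down = padded[2:, 1:-1]
    left = padded[1:-1, :-2]
    right = padded[1:-1, 2:]
    # A pixel survives only if it and its four direct neighbours are foreground.
    return center & up & down & left & right


def extract_contour(mask: np.ndarray) -> np.ndarray:
    """Contour K = XOR(S, erode(S, X)) as in Eq. (1)."""
    mask = mask.astype(bool)
    return np.logical_xor(mask, erode_cross(mask))


# Toy example: a 5x5 square inside a 7x7 image.
S = np.zeros((7, 7), dtype=bool)
S[1:6, 1:6] = True
K = extract_contour(S)
```

For the toy square, the result is its one-pixel-wide border, while the eroded interior is removed.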

To constrain the relevant contour section, a ROI similar to [9] is constructed (Fig. 1). Its bounds are defined by the start and end points of an additional line segment. Positional variance both in the parallel as well as in the orthogonal direction to this line segment is encoded by a 2D Gaussian distribution with a standard deviation of \(\sigma =6\,\mathrm {px}\) and truncation bounds at \(3\sigma \). This gives us a symmetrical fall-off in probability orthogonal to the line within a margin of \(37\,\mathrm {px}\). This spatial likelihood distribution is used to decide whether a contour point should be considered part of the relevant contour region. Since we can assume a mainly linear contour, we argue that using a threshold at \(1\sigma \) retains the most probable points while eliminating most outliers (Fig. 2a).
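The masking step can be sketched as follows. This is one possible reading under stated assumptions: the likelihood is modelled as an isotropic Gaussian of the distance to the annotated line segment, and all function names are hypothetical. In the proposed method the ROI is predicted by the network as an image, whereas this sketch evaluates the likelihood analytically at the contour points.

```python
import numpy as np

SIGMA = 6.0  # px, standard deviation stated in the text


def segment_distance(points, a, b):
    """Euclidean distance from each point (N, 2) to the segment a-b."""
    points = np.asarray(points, dtype=float)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    ab = b - a
    # Clamp the projection parameter so that points beyond the ends
    # are measured against the nearest end point.
    t = np.clip((points - a) @ ab / (ab @ ab), 0.0, 1.0)
    closest = a + t[:, None] * ab
    return np.linalg.norm(points - closest, axis=1)


def roi_likelihood(points, a, b, sigma=SIGMA):
    """Gaussian likelihood around the segment, truncated at 3 sigma."""
    d = segment_distance(points, a, b)
    lik = np.exp(-0.5 * (d / sigma) ** 2)
    lik[d > 3.0 * sigma] = 0.0
    return lik


def select_relevant(points, a, b, sigma=SIGMA):
    """Keep contour points whose likelihood is at least the 1-sigma value."""
    points = np.asarray(points, dtype=float)
    threshold = np.exp(-0.5)  # likelihood at a distance of exactly 1 sigma
    return points[roi_likelihood(points, a, b, sigma) >= threshold]


a, b = (0.0, 0.0), (100.0, 0.0)
contour = np.array([[50.0, 2.0], [50.0, 5.0], [50.0, 10.0],
                    [120.0, 0.0], [104.0, 0.0]])
kept = select_relevant(contour, a, b)

# One reading of the 37 px margin: 3 sigma on each side plus the centre pixel.
margin_px = 2 * 3 * int(SIGMA) + 1
```

In this toy example, the points at 2, 5, and 4 px from the segment are kept, while the points at 10 and 20 px are discarded.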

Fig. 2. Implementation of the two-line method for bone axis estimation based on the extracted segmentation contour.

2.2 Axis Construction with Two-Line Method

The auxiliary contour extension lines are obtained by fitting two linear functions to the pair of relevant contour regions. Since we cannot assume a designated dependent variable due to unknown image rotation, major axis regression is used [21]. Given these two lines, we can now perform a geometric construction of the in-between axis based on the midpoints of two parallel and intersecting line segments. This method is known as the two-line method and is a clinically known and trusted procedure, especially in pre-operative manual planning [6, 7, 8, 10]. First, a line segment bounded by the relevant contour region is parametrized for each contour line. One of these segments is subdivided by two points at distances \(\mathrm {d}_1\) and \(\mathrm {d_2}\) from the respective start and end points (Fig. 2a). The actual distances can be selected depending on the target anatomical structure to facilitate easier correction by the user. In a second step, these intermediary points are projected onto the opposing line segment (Fig. 2b). This procedure allows for different orientation and length of each segment and is close to clinical practice.
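The construction above can be sketched in a few lines of NumPy. This is a hedged illustration and not the authors' code: major axis regression is realized here through the principal eigenvector of the point covariance, the projection onto the opposing line is assumed to be orthogonal, and the names `fit_major_axis`, `project_onto_line`, and `two_line_axis` are hypothetical.

```python
import numpy as np


def fit_major_axis(points):
    """Major axis regression: a point on the line and a unit direction."""
    pts = np.asarray(points, dtype=float)
    centroid = pts.mean(axis=0)
    # The eigenvector of the largest covariance eigenvalue minimizes
    # the orthogonal residuals, independent of image rotation.
    eigvals, eigvecs = np.linalg.eigh(np.cov((pts - centroid).T))
    return centroid, eigvecs[:, np.argmax(eigvals)]


def project_onto_line(p, origin, direction):
    """Orthogonal projection of p onto the line (assumed projection type)."""
    return origin + ((p - origin) @ direction) * direction


def two_line_axis(region_a, region_b, d1, d2):
    """Return the midpoints m1, m2 that define the bone axis."""
    a = np.asarray(region_a, dtype=float)
    origin_a, dir_a = fit_major_axis(a)
    origin_b, dir_b = fit_major_axis(region_b)
    # Bound the segment on line A by the extent of its contour region.
    # Note: which end counts as "start" follows the arbitrary sign of dir_a.
    t = (a - origin_a) @ dir_a
    start = origin_a + t.min() * dir_a
    end = origin_a + t.max() * dir_a
    s1 = start + d1 * dir_a          # point at distance d1 from the start
    s2 = end - d2 * dir_a            # point at distance d2 from the end
    t1 = project_onto_line(s1, origin_b, dir_b)
    t2 = project_onto_line(s2, origin_b, dir_b)
    return 0.5 * (s1 + t1), 0.5 * (s2 + t2)


# Toy example: two parallel contour regions at y = 0 and y = 20.
region_a = np.column_stack([np.linspace(0, 100, 21), np.zeros(21)])
region_b = np.column_stack([np.linspace(10, 90, 17), np.full(17, 20.0)])
m1, m2 = two_line_axis(region_a, region_b, d1=10.0, d2=20.0)
```

For the two parallel toy regions, both midpoints lie on the line y = 10, halfway between the contours. A practical implementation would additionally fix the start/end orientation anatomically.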

2.3 Neural Network Architecture

The proposed construction relies on a segmentation mask of the target long bone and ROI encodings for both relevant contour regions. For combined prediction we use a multi-task variant of the hourglass network architecture by Newell et al. [9, 14]. This architecture allows a joint representation of both tasks to be optimized, which benefits execution time and computational footprint at inference. We separate segmentation and prediction of ROIs into two tasks. The segmentation task is trained with binary cross entropy to separate the target bone (foreground) from all other image content (background). The ROIs are optimized by direct matching of the pixel intensity values with a mean squared error loss. In addition, we employ gradient normalization [1, 9] to cope with different loss function characteristics and task difficulties. To limit the hardware requirements in consideration of the intra-operative application area, we refrain from a stacked network variant.
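The two task losses can be written out in NumPy as a rough sketch. The fixed weights `w_seg` and `w_roi` are placeholders and an assumption of this example: in the described method the task weighting is adapted by gradient normalization [1, 9], which is not reproduced here.

```python
import numpy as np


def binary_cross_entropy(prob, target, eps=1e-7):
    """Mean binary cross entropy between predicted probabilities and a mask."""
    prob = np.clip(prob, eps, 1.0 - eps)  # avoid log(0)
    loss = -(target * np.log(prob) + (1.0 - target) * np.log(1.0 - prob))
    return float(np.mean(loss))


def mean_squared_error(pred, target):
    """Mean squared error between predicted and annotated ROI heatmaps."""
    return float(np.mean((pred - target) ** 2))


def multi_task_loss(seg_prob, seg_target, roi_pred, roi_target,
                    w_seg=1.0, w_roi=1.0):
    """Weighted sum of both task losses (weights are hypothetical constants)."""
    return (w_seg * binary_cross_entropy(seg_prob, seg_target)
            + w_roi * mean_squared_error(roi_pred, roi_target))


# Toy example: a perfect prediction yields a loss close to zero.
seg_target = np.array([[0.0, 1.0], [1.0, 0.0]])
roi_target = np.array([[0.0, 0.5], [1.0, 0.2]])
perfect = multi_task_loss(seg_target, seg_target, roi_target, roi_target)
```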

2.4 Data and Evaluation

Network training and geometric construction were evaluated for the femur and tibia on a dataset of 221 clinical X-ray images of the knee joint. Each image was acquired as a lateral standard projection where the outlines of both femoral condyles are aligned. The ground truth segmentation masks and line segments representing the ROIs were annotated by a medically-trained engineer with the labelme annotation tool [19]. Our experiment and evaluation setup on this dataset was two-fold.

  1. Training of the network in three configurations (a) femur only, (b) tibia only, (c) joint training of femur and tibia, followed by a quantitative evaluation of the performance. For variant (c) the number of output channels for each task head of the network was increased accordingly.

  2. Assessment of clinical reliability of the automatic axis detection in an inter-rater study. To this end, three expert trauma surgeons (one site) were asked to annotate the femoral and tibial axes on all 38 evaluation images via two axis control points.

For both experiment series, a hold-out test set of 38 images with a \(3\,\mathrm {mm}\) calibration sphere was defined. Representative variability in bone truncation and absolute joint rotation was confirmed. The remaining data was split into training and validation subsets of 167 and 16 images, respectively. The split ensures disjoint patient groups between the training/validation and test datasets. Optimization for the first experiment step was performed using Stochastic Gradient Descent (SGD) with a batch size of 2 over 300 epochs on an NVIDIA TITAN RTX graphics card in the PyTorch (v.1.2) Deep Learning framework. We used a learning rate of \(2.5e{-}4\), which we halved every 50 epochs. To aid generalization and to prevent early overfitting, we applied L2 weight decay with a factor of \(5e{-}5\) and a basic online augmentation sequence during training. This sequence comprised affine transformations (scaling, rotation, shearing, horizontal flipping) and margin crops of random strength. Before images were passed through the network, min-max normalization to the interval [0, 1] was applied and the image resolution was standardized to \(256 \times 256\,\mathrm {px}\) by resizing and a subsequent center-crop. All reported results are based on the respective model parameters for which the minimum combined task error on the validation split was observed.
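The stated schedule and normalization can be illustrated with a short sketch. It assumes that epochs are counted from zero and that the first halving takes effect at epoch 50; the function names and the handling of constant images are likewise assumptions. A step schedule of this kind corresponds to what PyTorch offers as `StepLR` with `step_size=50` and `gamma=0.5`, although the text does not state which scheduler was used.

```python
import numpy as np

BASE_LR = 2.5e-4  # initial learning rate stated in the text


def learning_rate(epoch, base_lr=BASE_LR, step=50, factor=0.5):
    """Learning rate after halving every `step` epochs (epochs start at 0)."""
    return base_lr * factor ** (epoch // step)


def min_max_normalize(image):
    """Scale intensities to [0, 1]; constant images map to zeros (assumption)."""
    image = np.asarray(image, dtype=np.float64)
    lo, hi = image.min(), image.max()
    if hi == lo:
        return np.zeros_like(image)
    return (image - lo) / (hi - lo)


# Data split as reported: training / validation / hold-out test.
split = {"train": 167, "val": 16, "test": 38}
total_images = sum(split.values())

# Toy 12-bit image to illustrate the normalization.
normalized = min_max_normalize(np.array([[0, 1024], [2048, 4095]]))
```

Over 300 epochs the rate is halved five times, so the final epochs run at 1/32 of the initial learning rate.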

Table 1. Evaluation of segmentation performance for the femur and tibia (DICE = Sørensen–Dice coefficient; ASD = Average Surface Distance; HD = Hausdorff Distance).
Table 2. Angulation and displacement error for the femur and tibia in single/combined anatomy training. Results are reported for the anterior/posterior auxiliary lines and bone shaft axis. The displacement error is constructed as the mean orthogonal point-to-line distance of predicted points \(s_1/t_1\), \(s_2/t_2\), \(m_1/m_2\) onto the respective ground truth axis and combines translation and angulation error components. The best results for each axis are marked in bold (\(\mathrm {CI}_{95}\) = 95% confidence interval).
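The two error measures of Table 2 can be sketched as follows, assuming that each axis is represented by a point and a direction vector. The function names are hypothetical, and the values are in pixels; the conversion to millimetres via the calibration sphere is not shown.

```python
import numpy as np


def angulation_error_deg(dir_pred, dir_gt):
    """Angle in degrees between two undirected lines."""
    u = np.asarray(dir_pred, dtype=float)
    v = np.asarray(dir_gt, dtype=float)
    u = u / np.linalg.norm(u)
    v = v / np.linalg.norm(v)
    # Lines have no orientation, so the sign of the cosine is dropped.
    cos = np.clip(abs(u @ v), 0.0, 1.0)
    return float(np.degrees(np.arccos(cos)))


def displacement_error(points_pred, gt_point, gt_dir):
    """Mean orthogonal distance of predicted points to the ground truth line."""
    pts = np.asarray(points_pred, dtype=float)
    d = np.asarray(gt_dir, dtype=float)
    d = d / np.linalg.norm(d)
    rel = pts - np.asarray(gt_point, dtype=float)
    # Magnitude of the 2D cross product equals the point-to-line distance.
    dist = np.abs(rel[:, 0] * d[1] - rel[:, 1] * d[0])
    return float(dist.mean())


# Toy example: ground truth axis along x, predicted midpoints m1 and m2.
angle = angulation_error_deg([1.0, 1.0], [1.0, 0.0])
disp = displacement_error([[0.0, 1.0], [10.0, 3.0]], [0.0, 0.0], [1.0, 0.0])
```

Because the predicted points sit at different distances from the reference line, the displacement error reflects both a translation and an angulation component, as described in the caption.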

3 Results and Discussion

Bone Segmentation. The results for bone segmentation by the multi-task neural network variants are given in Table 1. In general we observe segmentation results which closely resemble the annotated ground truth. Despite missing annotations of other bony structures in the knee joint, the single-anatomy model is capable of delineating the target bone from other structures, even in ambiguous overlap areas. On the other hand, prediction quality of the combined variant does not suffer from a doubling of inference tasks which benefits fast execution time and a smaller computational footprint. A very low contour error indicates that the networks do not only learn the global shape but also successfully capture small details which are often caused by bony erosion and proliferation. This allows for marginal error propagation into geometric axis construction. Segmentation outliers indicated by higher Hausdorff error points are exclusively caused by the inserted measuring spheres which are not represented in the training data.
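For reference, the segmentation metrics named in Table 1 can be computed as in the following brute-force sketch. The exact definitions used in the evaluation are not given in the text; in particular, the symmetric form of the average surface distance is an assumption, and all function names are hypothetical.

```python
import numpy as np


def dice(a, b):
    """Sørensen-Dice coefficient of two binary masks."""
    a = np.asarray(a).astype(bool)
    b = np.asarray(b).astype(bool)
    denom = a.sum() + b.sum()
    return 1.0 if denom == 0 else float(2.0 * np.logical_and(a, b).sum() / denom)


def nearest_distances(pts_a, pts_b):
    """For each contour point in A, the distance to the nearest point in B."""
    pts_a = np.asarray(pts_a, dtype=float)
    pts_b = np.asarray(pts_b, dtype=float)
    diff = pts_a[:, None, :] - pts_b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1)).min(axis=1)


def hausdorff_distance(pts_a, pts_b):
    """Largest nearest-neighbour distance in either direction."""
    return float(max(nearest_distances(pts_a, pts_b).max(),
                     nearest_distances(pts_b, pts_a).max()))


def average_surface_distance(pts_a, pts_b):
    """Symmetric mean of nearest-neighbour distances (assumed definition)."""
    d_ab = nearest_distances(pts_a, pts_b)
    d_ba = nearest_distances(pts_b, pts_a)
    return float((d_ab.sum() + d_ba.sum()) / (len(d_ab) + len(d_ba)))


# Toy example: two overlapping masks and two small contour point sets.
A = np.zeros((4, 4), dtype=bool)
B = np.zeros((4, 4), dtype=bool)
A[0:2, 0:2] = True
B[0:2, 1:3] = True
contour_a = [[0.0, 0.0], [1.0, 0.0]]
contour_b = [[0.0, 0.0], [4.0, 0.0]]
hd = hausdorff_distance(contour_a, contour_b)
asd = average_surface_distance(contour_a, contour_b)
```

The toy contours illustrate why a single outlier, such as an inserted measuring sphere, raises the Hausdorff distance far more than the average surface distance.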

Table 3. Comparison of automatically detected shaft axis (Auto) to the annotation of three expert readers (E-1, E-2, E-3) and assessment of inter-rater variability. Due to missing midpoints \(m_1\) and \(m_2\) in the expert reader annotations, the respective displacement error is based on the two annotated control points. Here, \(\mapsto \) denotes a mapping of the \(1^{\mathrm {st}}\) rater’s control points on the predicted axis of the \(2^{\mathrm {nd}}\) rater; the reversed arrow marks a mapping in reverse order.

Axes Detection. The performance of the proposed geometric axis construction is presented in Table 2. We observe an average angulation error of less than \(0.65^{\circ }\) for the anterior and posterior auxiliary lines on both bones and only minor differences between single and combined training. This indicates that the predicted ROIs can provide masking of relevant contour sections on a sufficiently fine scale. We can also qualitatively confirm that the likelihood distribution follows the actual anatomical contour, although this area is only approximated by a straight line in the ground truth annotations. These observations strengthen our assumption that we can retain all relevant contour points by masking at a likelihood threshold of \(1\sigma \). In addition, low values for the displacement error (Table 2) indicate that the fitted lines are shifted only slightly off the ground truth bone contour. The constructed bone axes generally benefit from the combined training variant and exhibit a comparatively lower maximum error bound (Table 2). Furthermore, it can be observed that by training both anatomies together, the respective confidence intervals taper off and follow the downward shift of the position measure. Based on these results, we chose the combined network for evaluation in the inter-rater study.

Inter-rater Comparison. The reliability of our method in comparison to expert rater annotations is analyzed in Table 3. For the femur, low angulation and displacement errors indicate reliable axis estimates which are independent of the amount of truncation and rotation present in the image data. A significantly higher angular deviation of the tibial axis can be explained by comparatively more divergent contour lines. Together with structural variation of the anterior tibia (tibial tuberosity), this leads to higher complexity and differences in the individual approach to manual annotation. This reasoning is strengthened by comparison with rater E-3, for whom a systematically more posterior position and orientation can be observed. Compared to the differences between expert raters (Table 3), the automatic approach yields comparable performance and achieves axis predictions that lie within the inter-rater error bounds. It should be noted that agreement between raters could be further increased if a dedicated tool for semi-automatic two-line planning were used.

4 Conclusion

This study investigated a method for automatic detection of the shaft axis on long bone X-rays. The experiments reveal encouraging results which match expert rater performance. A major strength of the proposed method is the flexibility of ROI masking which we use to select relevant sections of the bone contour without strong prerequisites on image truncation and rotation. We see limitations in that no evaluation was performed for bones that suffer from increased antecurvation/recurvation (e.g. due to natural deformity or increased weight bearing) or major occlusion of the contour by surgical implants. In addition, future work should analyze potential extensions to our method to promote axis estimation in cases of multi-fragment fractures.

Disclaimer. The methods and information presented here are based on research and are not commercially available.