Introduction

Colorectal cancer ranks as the second leading cause of cancer-related deaths worldwide. During the progression of the disease, around 30% of colorectal cancer patients develop colorectal cancer liver metastasis (CLM) [1, 2]. In recent decades, thermal ablation has been incorporated into treatment guidelines as a viable alternative to surgery for selected patients with oligometastatic disease [3]. Thermal ablation offers several key advantages. First, its tissue-sparing nature allows precise targeting of tumors while minimizing damage to the surrounding liver tissue, which improves the ability to offer salvage therapies at the time of intrahepatic recurrence. Additionally, it can be performed using a minimally invasive percutaneous approach, reducing treatment-related complications, hospitalization time, and healthcare-related costs [4]. Collectively, these characteristics contribute to a significant reduction in treatment-associated morbidity and enhance the overall efficacy of managing CLM [5].

Regarding local oncological outcomes, numerous observational studies indicate that achieving ablation margins larger than 5 mm is the most crucial technical factor for ensuring effective local tumor control [6,7,8,9]. Despite the improved accuracy of three-dimensional measurements facilitated by ablation confirmation software packages, the currently available methods for assessing ablation margins are highly variable. Most of the existing three-dimensional techniques available for clinical use rely on either rigid image registration (RIR) or intensity-based deformable image registration (DIR) [7, 9]. However, both of these techniques are susceptible to registration errors resulting from liver deformation caused by patient positioning and breathing, placement of the ablation applicator, hydrodissection, and tissue contraction associated with the ablation process. A recent study has proposed a biomechanical DIR method to address deformations for quantifying ablative margins, which has been validated in a retrospective cohort [6, 10]. However, there is a lack of studies comparing different image registration methods in terms of local tumor outcomes. This gap in research may contribute to misinterpretation of the utility of different ablation confirmation methodologies, ultimately resulting in hesitancy toward their broader application.

The objective of this study was to compare the performance of minimal ablative margins (MAMs) quantified by biomechanical deformable image registration (DIR) and intensity-based rigid image registration (RIR) methods in predicting local tumor outcomes following colorectal liver metastasis (CLM) thermal ablation.

Materials and methods

Study population

We conducted a single-institution retrospective assessment of patients who underwent CT-guided microwave or radiofrequency ablation for CLM between May 2016 and October 2021. Data were drawn from an institutional liver ablation registry (IRB No. PA-15–0566), which adhered to the Health Insurance Portability and Accountability Act and had a waiver of informed consent. To be eligible for percutaneous microwave or radiofrequency ablation, patients could have up to five CLMs, each measuring ≤ 5 cm, and no more than three extrahepatic sites of disease (e.g., pulmonary nodules, lymph nodes, or peritoneal nodules). Exclusion criteria for this study included patients without both intraprocedural pre-ablation and final post-ablation contrast-enhanced CT images, as well as tumors that were followed for less than 1 year without local tumor progression (LTP) (Fig. 1).

Fig. 1 Participant flowchart

CT-guided ablation procedure and follow-up

All CT-guided percutaneous ablation procedures were carried out by board-certified interventional radiologists. The exact method of CT-guided ablation has been previously described [6, 11]. The objective of all procedures was to achieve ablative margins of ≥ 5 mm, which were assessed by comparing pre- and post-ablation contrast-enhanced CT images. This assessment was performed using a two-dimensional anatomic landmarks-based margin visual assessment method, a commercially available intensity-based ablation confirmation software (NEUWAVE™ System, NeuWave Medical), or an investigational ablation confirmation software currently undergoing clinical evaluation (ClinicalTrials.gov identifier: NCT04083378) [12].

Ablation outcomes were assessed according to reporting standards for ablation [13]. Post-ablation contrast-enhanced CT, MRI, or PET examinations were used to assess imaging-related local oncologic outcomes. Residual tumor and LTP were defined according to the criteria described by Ahmed et al. [13]. For this study, two interventional radiologists (B.C.O. and Y.-M.L., with 14 and 6 years of experience, respectively) independently evaluated all available cross-sectional images. They were blinded to the MAM results, and their assessment determined the oncological outcome of each ablated tumor. Disagreements regarding the oncological outcomes were resolved by consensus.

Deformable and rigid registration and ablative margin quantification

The intraprocedural pre- and post-ablation contrast-enhanced CT images were uploaded to a radiation therapy treatment planning system, RayStation version 11B DTK (RaySearch Laboratories), for ablative margin quantification using a workflow developed in-house [10]. The CT images were acquired in the axial plane during the portal venous phase, with an in-plane image resolution ranging from 0.6–1.0 mm × 0.6–1.0 mm, and a slice thickness of 3 mm.

Autosegmentation based on custom convolutional neural networks was performed on both the pre- and final post-ablation CT images to contour the liver, target tumor, and ablation zone [14, 15]. Subsequently, two different registration methods were employed to align the pre- and post-ablation CT images. The first method involved intensity-based RIR, utilizing gray-level cross-correlation of the liver contours [10]. The second method utilized a biomechanical DIR approach based on finite element modeling [16, 17], which has been integrated and validated in the treatment planning system.

Following the image registration processes, the target tumor was propagated from the pre-ablation CT images to the post-ablation CT images using the results obtained from the RIR and DIR methods, respectively. No additional adjustment of the registration was performed. The MAM was then computed as the shortest distance between the boundaries of the target tumor and the ablation zone on the post-ablation CT images. The MAM was defined as a nonnegative number: a measurement of 0 mm indicated that the mapped tumor contour reached or extended beyond the ablation zone contour, whereas a measurement of > 0 mm indicated that the ablation zone completely covered the tumor. In cases where the MAM was less than 5 mm, a virtual 5-mm ablative margin was artificially created on the intraprocedural post-ablation CT images [6, 10]. The tissue not covered by the ablation zone within this virtual 5-mm margin was considered tissue at risk for tumor progression. For subcapsular (< 10 mm from the liver edge) or perivascular (< 10 mm from a vessel ≥ 3 mm in diameter) tumors, calculation of the MAM did not include the area abutting the liver capsule or adjacent vessel.
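The MAM definition above can be illustrated with a minimal sketch on voxel masks. This is not the in-house RayStation workflow: it assumes the mapped tumor and the ablation zone are available as sets of integer voxel indices, measures brute-force distances between voxel centers, and omits the exclusion of capsule- or vessel-abutting regions. The function names, the voxel spacing argument, and the toy cubes are hypothetical.

```python
from math import inf, sqrt

# 6-connected neighborhood used to decide whether a voxel lies on a surface.
NEIGHBORS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def boundary_voxels(mask):
    """Return voxels of `mask` that have at least one neighbor outside it."""
    return {
        (x, y, z)
        for (x, y, z) in mask
        if any((x + dx, y + dy, z + dz) not in mask for dx, dy, dz in NEIGHBORS)
    }


def minimal_ablative_margin(tumor, ablation, spacing=(1.0, 1.0, 1.0)):
    """Shortest distance (mm) from any tumor voxel to the ablation zone surface.

    Hypothetical helper: returns 0.0 when any mapped tumor voxel is not
    covered by the ablation zone, so the result is never negative.
    """
    if not tumor <= ablation:
        return 0.0
    edge = boundary_voxels(ablation)
    best = inf
    for t in tumor:
        for e in edge:
            d = sqrt(sum(((a - b) * s) ** 2 for a, b, s in zip(t, e, spacing)))
            best = min(best, d)
    return best


def cube(lo, hi):
    """Toy mask: all voxel indices with lo <= x, y, z <= hi."""
    r = range(lo, hi + 1)
    return {(x, y, z) for x in r for y in r for z in r}


ablation_zone = cube(0, 10)    # toy ablation zone with 1-mm isotropic voxels
centered_tumor = cube(4, 6)    # fully covered, well inside the zone
escaping_tumor = cube(8, 12)   # partly mapped outside the zone

print(minimal_ablative_margin(centered_tumor, ablation_zone))
print(minimal_ablative_margin(escaping_tumor, ablation_zone))
```

In this toy geometry the centered tumor sits four voxels from the nearest surface voxel of the zone, while the partly uncovered tumor is assigned a margin of 0 mm. A production implementation would more likely use a distance transform on the image grid than a pairwise search.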

The registrations obtained from both methods were assessed to determine the geometric accuracy of the registration, following the recommendations outlined by the American Association of Physicists in Medicine Task Group 132 [18]. The alignment of the liver contour was evaluated using the Dice similarity coefficient (DSC) [19]. The DSC is calculated by multiplying the volume where the pre- and post-ablation liver contours overlap by 2 and dividing it by the total volume of both contours combined. As the contours converge and agree more closely, the DSC value approaches 1. Conversely, if the volumes diverge, resulting in two nonoverlapping structures, the DSC value approaches 0. Additionally, if the tumor, completely confined within the liver on the pre-ablation image, was found to be located outside the liver boundary after image registration, the volume of tumor outside the liver was recorded as a metric of registration uncertainty.
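The two registration accuracy metrics described above can be sketched as follows. This is an illustrative example only, assuming that each contour is represented as a set of voxel indices of equal voxel volume, so that volumes reduce to voxel counts; the function names and toy data are hypothetical, and the value returned for two empty contours is a convention chosen here.

```python
def dice_similarity(contour_a, contour_b):
    """Dice similarity coefficient: 2 * |A ∩ B| / (|A| + |B|)."""
    if not contour_a and not contour_b:
        return 1.0  # assumed convention: two empty contours agree perfectly
    overlap = len(contour_a & contour_b)
    return 2 * overlap / (len(contour_a) + len(contour_b))


def percent_tumor_outside_liver(mapped_tumor, liver):
    """Percentage of mapped tumor voxels that fall outside the liver contour."""
    outside = len(mapped_tumor - liver)
    return 100 * outside / len(mapped_tumor)


# Toy one-dimensional "voxels": the two liver contours share 90 of 100 voxels.
liver_pre_mapped = set(range(0, 100))
liver_post = set(range(10, 110))
print(dice_similarity(liver_pre_mapped, liver_post))

# Toy tumor of 10 voxels, 3 of which are mapped outside the liver.
mapped_tumor = set(range(7, 17))
print(percent_tumor_outside_liver(mapped_tumor, liver_post))
```

Identical contours give a DSC of 1 and disjoint contours give 0, matching the behavior described in the text.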

Statistical analyses

Categorical data were presented as frequencies and percentages, while quantitative data were expressed as means ± SD or medians with interquartile range (IQR) when appropriate. The Wilcoxon signed-rank test was used to detect differences in paired data, and the Mann–Whitney U test was used for ordinal data. Agreement between the MAM measurements of the two registration methods was evaluated using Bland–Altman analysis. To evaluate the factors associated with any difference in MAM measurement between the two registration methods, univariable and multivariable logistic regression analyses were performed.
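For readers unfamiliar with Bland–Altman analysis, the following minimal sketch shows the usual computation of the mean difference and limits of agreement for paired measurements. It assumes the common convention of mean ± 1.96 SD of the paired differences; the study's analyses were run in R, and the function name and the toy margins below are hypothetical rather than study data.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (mean difference, lower limit, upper limit) for paired data.

    Limits of agreement are taken as mean difference ± 1.96 × SD of the
    paired differences (assumed convention).
    """
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Toy paired margins in mm for four tumors (illustrative values only).
mam_dir = [5.0, 3.0, 0.0, 8.0]
mam_rir = [4.0, 4.0, 1.0, 6.0]

bias, lower, upper = bland_altman(mam_dir, mam_rir)
print(f"mean difference {bias:.2f} mm, limits of agreement {lower:.2f} to {upper:.2f} mm")
```

Under this convention the limits are symmetric about the mean difference, so a wide interval signals that individual margins can disagree substantially even when the mean difference is close to zero.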

To evaluate the performance of MAM, generated by different registration methods, in predicting residual tumor and 1- and 2-year LTP, the area under the receiver operating characteristic curve (AUC) with 95% confidence intervals (CI) was calculated. Subgroup analyses were conducted using different tumor characteristics. The AUC values were compared using the DeLong method. Cumulative incidence functions were utilized to estimate the time to LTP, considering death a competing event. Competing-risks analysis, employing the Fine and Gray subdistribution hazard model, was employed to assess associations between time to LTP and clinical factors, including MAM. Statistical significance was determined with a threshold of p values less than 0.05. All calculations were performed on a per-tumor basis, and the statistical analyses were conducted using the R statistical software (version 4.2.3; The R Foundation).
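As a hedged illustration of how an AUC can be obtained from margin measurements, the sketch below uses the rank-based (Mann–Whitney) formulation: the AUC equals the probability that a randomly chosen tumor with an event has a smaller MAM than a randomly chosen tumor without one, with ties counted as one half. It does not implement the DeLong comparison or the competing-risks models, and the function name and toy margins are hypothetical.

```python
def auc_smaller_margin_predicts_event(margins_event, margins_no_event):
    """AUC for MAM as a predictor, where smaller margins indicate higher risk.

    Counts, over all (event, no-event) pairs, how often the event tumor has
    the smaller margin; ties contribute 0.5.
    """
    score = 0.0
    for m_event in margins_event:
        for m_free in margins_no_event:
            if m_event < m_free:
                score += 1.0
            elif m_event == m_free:
                score += 0.5
    return score / (len(margins_event) * len(margins_no_event))


# Toy margins in mm (illustrative values only, not study data).
mam_with_event = [0.0, 0.0, 2.0]
mam_without_event = [0.0, 3.0, 5.0, 6.0]

print(auc_smaller_margin_predicts_event(mam_with_event, mam_without_event))
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of tumors with and without an event.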

Results

Study population

A total of 72 patients (mean age of 57 years ± 12 (SD), 44 men) and 139 ablated tumors (mean size of 1.5 cm ± 0.8 (SD)) were included. Patient and tumor characteristics are summarized in Table 1. Over a median follow-up period of 29.4 months (range, 11.7 to 74.0 months), residual tumor rate was 0.7% (1/139) and the LTP rate was 17.4% (24/138). The cumulative incidences of LTP at 12 and 24 months were 12% (95%CI 7, 18) and 14% (95%CI 9, 21), respectively.

Table 1 Patient and tumor characteristics

Image registration and ablative margin quantification

Table 2 presents the results of the registration and ablative margin assessment. There was no significant difference between the median MAM quantified by the DIR and RIR methods. The mean MAM difference between the DIR and RIR methods was –0.2 mm, with limits of agreement ranging from –4.3 to 3.9 mm (Fig. 2). Subgroup analyses of MAM quantified by DIR versus RIR methods, stratified by subcapsular location, perivascular location, and tumor size, are shown in Table 3.

Fig. 2 Bland–Altman plot of minimal ablative margin quantified by deformable versus rigid registration methods. The mean difference of minimal ablative margin was –0.2 mm, with limits of agreement from –4.3 to 3.9 mm

Table 2 Registration and minimal ablative margin assessment
Table 3 Subgroup analyses of minimal ablative margin quantified by deformable and rigid registration methods stratified by subcapsular location, perivascular location, and tumor size

The biomechanical DIR yielded a median liver DSC of 0.97 (range, 0.96–0.98), while the RIR resulted in a median DSC of 0.96 (range, 0.67–0.98) (p < 0.001). Only one patient, who was repositioned during the procedure, had an RIR DSC of < 0.80. Out of 139 tumors, 27 (19%) were partially or totally mapped outside the liver using the DIR method, compared with 46 (33%) using the RIR method (p < 0.001). All of these were subcapsular tumors. The median percentage of tumor volume mapped outside the liver was 0% (IQR [0, 0]) for DIR, and 0% (IQR [0, 7]) for RIR (p = 0.002) (Fig. 3). Figure 4 displays a representative case.

Fig. 3 Violin plot shows the percentages of each tumor volume mapped outside the liver with deformable versus rigid image registration methods (p = 0.002)

Fig. 4 Images of a 74-year-old woman who had one 1.7-cm colorectal liver metastasis undergoing CT-guided microwave ablation processed with minimal ablative margin quantification with deformable and rigid registration methods. A Axial contrast-enhanced CT scan obtained before ablation shows segmentation of liver (blue) and target tumor (green). B Axial contrast-enhanced CT scan obtained immediately after ablation with segmentation of liver (blue), ablation zone (orange), and mapped target tumor using deformable (green) and rigid (red) registration methods shows retraction of liver edge. The rigid registration method mapped the target tumor outside liver, which resulted in a falsely larger MAM quantification

The univariable and multivariable logistic regression analyses showed tumor size and perivascular location were two independent factors in predicting any difference of MAM between the two registration methods (Table 4).

Table 4 Univariable and multivariable logistic regression of factors associated with any difference of minimal ablative margin between two registration methods

Local ablation outcomes and minimal ablative margin

The MAM in residual tumors and tumors with LTP ranged from 0 to 3.2 mm using the DIR method and from 0 to 7.6 mm using the RIR method. In terms of predicting residual tumor and 1-year LTP, the AUC was 0.89 (95%CI 0.83, 0.94) for the DIR method and 0.72 (95%CI 0.61, 0.83) for the RIR method (p < 0.001) (Fig. 5). Similarly, the DIR method had a higher AUC than the RIR method in predicting 2-year LTP (0.90 versus 0.72; p < 0.001) (Supplementary Fig. S1). When tumors were stratified according to location and size, the AUC for predicting residual tumor and 1-year LTP was higher for the DIR method than for the RIR method (Fig. 6).

Fig. 5 Receiver operating characteristic curves for predicting residual tumor and 1-year local tumor progression by minimal ablative margin quantified by deformable versus rigid registration methods

Fig. 6 Subgroup analyses of receiver operating characteristic curves for predicting residual tumor and 1-year local tumor progression by minimal ablative margin quantified by deformable versus rigid registration methods stratified by subcapsular location (left column), perivascular location (middle column), and tumor size (right column)

In the univariable competing-risks regression model, tumor size, MAM, and the volume of tissue at risk for tumor progression were significant predictors of LTP. After adjusting for tumor size, an MAM of 0 quantified by DIR had the highest subdistribution hazard ratio (SHR) of 9.3 (95%CI 4.1, 20.8; p < 0.001) (Table 5).

Table 5 Subdistribution hazard ratio for factors associated with local tumor progression by competing-risks regression model

Discussion

Accurate and reproducible image registration methods are prerequisites for ablation confirmation software in evaluating ablation completeness. In this study, we found that using biomechanical deformable image registration (DIR) for minimal ablative margin (MAM) quantification was more accurate in liver registration and outperformed the intensity-based rigid image registration (RIR) method in predicting local tumor outcomes in patients with colorectal liver metastasis (CLM) undergoing thermal ablation. This was demonstrated by a higher area under the receiver operating characteristic curve (AUC) (0.90 versus 0.72; p < 0.001) and subdistribution hazard ratio (SHR) (9.3 versus 2.4), regardless of tumor size and location.

Excellent registration accuracy and short processing time are crucial when ablation confirmation software is used during clinical ablation procedures [18], because decisions regarding the need for immediate repeat ablation must be made quickly and accurately. While RIR offers the advantages of simple implementation and fast processing, our study demonstrates that DIR outperformed RIR in terms of the Dice similarity coefficient (DSC). The lowest DSC was 0.96 with the DIR method, whereas it was 0.67 with the RIR method, even though the CT images were acquired using the same scanner and settings within the same procedure. The patient with an RIR DSC of 0.67 had a change in positioning during the procedure. According to the American Association of Physicists in Medicine Task Group 132 [18], a DSC range of 0.80 to 0.90 is considered acceptable. The consistently high DSC obtained with DIR is especially valuable considering that registration of two CT image sets collected at different times can be challenging due to different image levels or deformation of the liver caused by the patient’s breathing, heartbeat, tissue retraction following ablation, or a change of position on the CT table. Restricting the registration of two images to simple rigid transformations often leaves residual uncertainties because soft tissue is deformable. Although DIR holds potential for mitigating these uncertainties, limitations and challenges persist. In the present study, we observed that several subcapsular tumors were partially mapped outside the liver contour, even when the biomechanical DIR method was employed. The DIR algorithm employs a deformation model, which can be over-constrained in some circumstances. For instance, DIR algorithms assume smoothness of the vector field created on liver contours. However, this assumption may result in registration errors when a singularity in the vector field exists with significant local deformation. The large number of degrees of freedom in DIR can also lead to ambiguity in the deformation vector field for certain algorithms, particularly in areas with unsmooth contours. Consequently, registration in these unsmooth areas can be prone to inaccuracies. Nevertheless, in this study, fewer tumors were mapped partially outside the liver using the DIR method than the RIR method (19% versus 33%, p < 0.001), and the interquartile range of the percentage of tumor volume mapped outside the liver was smaller for the DIR method than for the RIR method ([0, 0] versus [0, 7], p = 0.002).

Numerous studies have demonstrated that three-dimensional ablation confirmation software outperforms visual comparison of pre- and post-ablation two-dimensional CT images [7,8,9]. However, notable limitations in image registration have been observed. In these studies, patient selection was crucial to minimize registration errors, necessitating the inclusion of patients with similar image characteristics, such as slice thickness, image number, and position. Manual adjustments to image registration were also required in these studies. For instance, a study using rigid registration for ablative margin assessment found that image registration failed for 14% of tumors [7]. Another study evaluating ablation confirmation software using an intensity-based DIR method reported manual adjustment in 24% of cases [9]. Similarly, two other studies utilizing different registration methods had to exclude 14–16% of cases due to failed registration [7, 20]. With two commercially available ablation confirmation software packages, additional measurements were necessary to quantify ablative margins in subcapsular regions [8, 9]. These image registration failures limited the use of ablation confirmation software, particularly when deformations occurred during the ablation, such as those caused by artificial ascites and liver tissue contraction following ablation.

In our study, biomechanical DIR achieved good accuracy without the need for adjustments. Only 19% (27/139) of tumors were partially or totally mapped outside the liver when the DIR method was applied. We believe this is attributable to radical changes in the liver contour resulting from the contraction and dehydration of the ablation zone following the ablation. The DIR algorithm assumes smoothness of the liver contours, which leads to registration error in these cases. Arguably, the volume of the ablated tumor was also reduced following ablation, although this phenomenon was not reflected in the model. The smoothness that the algorithm assumes for the controlling liver contour therefore caused the target tumor to be partially mapped outside the actual liver contour, mimicking the shrinkage of the ablated tumor and providing a corrected ablative margin. Lastly, it is worth noting that registration errors were less frequent with biomechanical DIR than with the rigid method in the present study. Among the 86 subcapsular tumors, only 27 (31%) were affected when DIR was used, compared with 46 (53%) when RIR was used. Further, previous research has shown that biomechanical DIR is more accurate than other DIR methods in liver images [10, 21,22,23,24]. If RIR is used to quantify the ablative margin of subcapsular tumors, caution should be exercised to account for potential registration errors. In this study, we found that smaller tumors and perivascular tumors were associated with larger differences in MAM quantified by the DIR and RIR methods. However, this might be explained by the fact that more tumors with a MAM of 0 mm were observed among larger or non-perivascular tumors, which reduced the influence of the registration method on MAM quantification. These findings did not translate into clinical outcomes when subgroup analyses stratified by tumor size and location were applied, which supports using the DIR method in all circumstances.

One limitation of this study is its retrospective, single-institution design. At our institution, liver ablation procedures were performed under CT guidance. However, the decision to acquire intraprocedural contrast-enhanced CT scans was at the discretion of the operators. In our cohort, up to 30% of cases did not have complete pre- and post-ablation contrast-enhanced CT images and had to be excluded. This exclusion may introduce a selection bias, potentially limiting the generalizability of our findings. Moreover, concerns regarding contrast-induced nephropathy may limit the use of intraprocedural contrast-enhanced CT. Furthermore, we did not compare the biomechanical DIR method with other available ablation software that employs different DIR methods. This comparison warrants further investigation. Additionally, we did not assess the accuracy of registration through visual inspection of anatomical landmarks, as this approach is subjective and susceptible to errors associated with two-dimensional image comparison. The use of vessel bifurcation detection in CT scans has been proposed for automatic and objective assessment of DIR accuracy [25].

In conclusion, this study supports the role of biomechanical deformable image registration (DIR) as the preferred image registration method over rigid image registration (RIR) for quantifying minimal ablative margin (MAM) using intraprocedural contrast-enhanced CT images. Therefore, we recommend utilizing biomechanical DIR for ablative margin quantification whenever feasible. However, caution must be exercised when dealing with subcapsular tumors that involve significant liver edge contraction, as complex deformations of this nature may not be effectively addressed by any of the available registration methods.