Introduction

Multi-detector-row computed tomography (MDCT) has become the first-line imaging investigation tool for staging of solid tumours and for the assessment of tumour response after therapy [1]. As tumour size is the main determinant for patient outcome and treatment options, reliable, comparable and precise measurements are mandatory.

In the 1980s, the World Health Organisation (WHO) introduced standard guidelines for a bidimensional measurement for radiologic tumour response evaluation: the product of the maximum axial tumour diameter and its longest perpendicular diameter in the same image [2].

In the 1990s, this bidimensional WHO measurement was revised by a commission of the European Organisation for Research and Treatment of Cancer, the National Cancer Institute of the United States and the National Cancer Institute of Canada Clinical Trials Group. As a result, the revised criteria, the Response Evaluation Criteria in Solid Tumours (RECIST), which use unidimensional measurement of the longest axial diameter [3, 4], were determined to be a reliable means of quantifying therapy response.

The newest MDCT techniques allow imaging with high spatial resolution and acquisition of 3D datasets. Using these datasets, algorithms have been developed to assist the radiologist in tumour staging. Volumetric quantification has thus become feasible, leading to a more precise growth assessment and, potentially, decreased interobserver variability [5]. In the evaluation of lung nodules, for example, previous studies demonstrated high accuracy [6, 7], reproducibility [8, 9] and decreased interobserver variability [8, 10].

Although volumetry might provide better representation of tumour dimensions than uni- and bidimensional techniques [11], clinical guidelines are still based on uni- or bidimensional measurements, making accurate and reproducible lesion measurements necessary.

Thus, the purpose of our study was to compare semi-automated measurements with manual measurements of RECIST and WHO criteria in hyperdense, hypodense and heterogeneous hepatic metastases on standard MDCT slices.

Materials and methods

Patients

The study was approved by our institutional review board. Seventy-six retrospectively selected patients with a known primary cancer likely to metastasize to the liver underwent MDCT from June 2004 to September 2007 at the Department of Diagnostic Radiology, RWTH Aachen University Hospital, either to detect possible hepatic metastases or for follow-up of known liver metastases. According to the contrast enhancement of the hepatic lesions, patients were divided into three groups with hyperdense (n = 35), hypodense (n = 20) and heterogeneous (n = 21) liver metastases (Table 1).

Table 1 Hepatic lesions

Scan protocol

CT was performed on a 16-slice MDCT (SOMATOM Sensation 16; Siemens Medical Solutions, Forchheim, Germany). All subjects were scanned in craniocaudal direction during inspiratory breath-hold. Acquisition parameters are given in Table 2.

Table 2 CT acquisition parameters

The contrast agents were administered with a power injector (CT2; Medtron, Saarbrücken, Germany).

The scan delay was adapted on an individual basis using the bolus tracking method. The arrival of the contrast material was monitored with a series of dynamic axial low-dose CT images at the level of the abdominal aorta [threshold: 140 Hounsfield units (HU)]. The spiral CT acquisition of the entire liver at the arterial phase was started 5 s after the threshold level; the portal-venous phase started another 45 s after the arterial examination.
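The timing rule described above can be illustrated with a minimal Python sketch. It is not the scanner's implementation: the function names and the monitoring samples are hypothetical, and we assume that "45 s after the arterial examination" means 45 s after the end of the arterial acquisition, whose duration is passed in as a parameter.

```python
from typing import Optional, Sequence, Tuple

THRESHOLD_HU = 140.0     # trigger threshold in the abdominal aorta (from the text)
ARTERIAL_DELAY_S = 5.0   # delay between trigger and arterial acquisition (from the text)
PORTAL_DELAY_S = 45.0    # delay after the arterial acquisition (interpretation, see lead-in)


def trigger_time(samples: Sequence[Tuple[float, float]],
                 threshold_hu: float = THRESHOLD_HU) -> Optional[float]:
    """Return the time (s) of the first monitoring sample at or above the threshold."""
    for time_s, hu in samples:
        if hu >= threshold_hu:
            return time_s
    return None  # threshold never reached in the monitoring series


def phase_start_times(samples: Sequence[Tuple[float, float]],
                      arterial_scan_duration_s: float) -> Tuple[float, float]:
    """Return (arterial_start, portal_venous_start) in seconds after monitoring began."""
    t_trigger = trigger_time(samples)
    if t_trigger is None:
        raise ValueError("aortic enhancement never reached the trigger threshold")
    arterial_start = t_trigger + ARTERIAL_DELAY_S
    # Assumption: the portal-venous delay counts from the end of the arterial scan.
    portal_start = arterial_start + arterial_scan_duration_s + PORTAL_DELAY_S
    return arterial_start, portal_start
```

Each sample is a hypothetical (time in seconds, aortic density in HU) pair taken from the low-dose monitoring images.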

Data analysis

CT evaluation by radiologists

Up to five sharp and well-defined hepatic target lesions, selected according to RECIST standards, were chosen, including hyperdense lesions in the arterial phase and hypodense and heterogeneous lesions in the portal-venous phase. The following were measured by an experienced radiologist to assess RECIST and WHO criteria (Fig. 1):

R1: Longest axial diameter

R2: Longest diameter perpendicular to R1 in the same image

R3: Product of R1 and R2 (R1 × R2)
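As a small illustration of how the three values relate, the following Python sketch stores the two diameters of one lesion and derives the WHO product from them. The class and attribute names are hypothetical and serve only this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LesionMeasurement:
    """Diameters of one target lesion measured on a single axial image."""
    longest_mm: float        # R1: longest axial diameter (RECIST)
    perpendicular_mm: float  # R2: longest diameter perpendicular to R1

    def __post_init__(self) -> None:
        if self.longest_mm <= 0 or self.perpendicular_mm <= 0:
            raise ValueError("diameters must be positive")

    @property
    def who_product_mm2(self) -> float:
        """R3: bidimensional WHO product R1 x R2 in square millimetres."""
        return self.longest_mm * self.perpendicular_mm


# Illustrative values only, not taken from the study data.
lesion = LesionMeasurement(longest_mm=20.0, perpendicular_mm=15.0)
print(lesion.who_product_mm2)
```

Because the product is formed per lesion, the mean of R3 over a group of lesions is generally not equal to the product of the mean R1 and the mean R2.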

Fig. 1

Axial CT with measurement of the longest axial diameter (RECIST) and the longest perpendicular diameter (WHO) in a hypodense hepatic metastasis of a patient with colorectal carcinoma

CT evaluation by the software

For semi-automated evaluation using the software Syngo Oncology (Siemens Medical Solutions, Forchheim, Germany), patient data were transferred to the software through the network. Semi-automated measurements are started with a directed mouse click into the lesion in axial, coronal or sagittal view. Software results are presented graphically to the user. Manual editing would be possible but was not allowed in our study. In keeping with the manual measurements, semi-automated measurements were named as follows (Fig. 2):

O1: Longest axial diameter

O2: Longest diameter perpendicular to O1 in the same image

O3: Product of O1 and O2 (O1 × O2)
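How diameters of this kind can be derived from a segmentation result is sketched below in Python. This is not the vendor's method, which is not described here; it is one simple geometric approach under stated assumptions: the lesion is given as a set of hypothetical (row, column) pixel indices in one axial image, distances are taken between pixel centres (which underestimates each diameter by roughly one pixel), and the perpendicular diameter is approximated by binning pixels along the long axis.

```python
import math
from collections import defaultdict
from typing import Iterable, Tuple


def axial_diameters(pixels: Iterable[Tuple[int, int]],
                    spacing_mm: Tuple[float, float] = (1.0, 1.0)) -> Tuple[float, float]:
    """Return (longest diameter, longest perpendicular diameter) in mm."""
    pts = [(r * spacing_mm[0], c * spacing_mm[1]) for r, c in pixels]
    if len(pts) < 2:
        raise ValueError("need at least two lesion pixels")

    # Longest diameter: brute-force maximum distance over all pixel pairs.
    longest, a, b = 0.0, pts[0], pts[0]
    for i, p in enumerate(pts):
        for q in pts[i + 1:]:
            d = math.dist(p, q)
            if d > longest:
                longest, a, b = d, p, q
    if longest == 0.0:
        raise ValueError("all lesion pixels coincide")

    # Unit vector along the long axis.
    ux, uy = (b[0] - a[0]) / longest, (b[1] - a[1]) / longest

    # Group pixels by position along the axis; within each group, the spread
    # across the axis approximates a chord perpendicular to the long axis.
    bin_width = min(spacing_mm)
    extents = defaultdict(lambda: [math.inf, -math.inf])
    for x, y in pts:
        along = (x - a[0]) * ux + (y - a[1]) * uy
        across = -(x - a[0]) * uy + (y - a[1]) * ux
        lo_hi = extents[round(along / bin_width)]
        lo_hi[0] = min(lo_hi[0], across)
        lo_hi[1] = max(lo_hi[1], across)
    perpendicular = max(hi - lo for lo, hi in extents.values())
    return longest, perpendicular
```

The pairwise search is quadratic in the number of pixels, which is acceptable for a sketch; restricting it to boundary pixels would be the obvious refinement for large lesions.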

Fig. 2

Screenshot applying the Oncology software tool to the same patient represented in Fig. 1

In principle, the software is based on an extended version of the lung lesion segmentation approach described in [12]. The density of lung lesions and surrounding lung parenchyma is well known and can be used for defining fixed thresholds by using, e.g., a region-growing approach. However, the density of liver lesions and liver parenchyma varies significantly depending on such factors as contrast agent timing, cancer type and scan parameters. To account for this, the following algorithm was developed:

  1. Initialization: User draws a rough diameter across the lesion in one image plane. For smaller lesions, a single click into the lesion is also sufficient.

  2. The algorithm estimates a volume of interest (VOI) around the lesion based on user click point(s).

  3. The algorithm estimates thresholds for initial segmentation of the lesion by histogram analysis within the VOI and around the click-point position.

  4. 3D region-growing based on the calculated thresholds yields initial segmentation.

  5. Adjacent structures of similar density are separated by morphological operations (erosion, dilation and combination with region-growing mask).

  6. Plausibility check between resulting segmentation mask and initial rough diameter or click point is performed; adjustment of initial thresholds and recalculation of steps 3-5 are performed if necessary.

A more detailed description of the algorithm can be found in [13].
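To make steps 2 to 4 more concrete, the following Python sketch shows a simplified seed-based segmentation. It is not the algorithm of the software: the histogram analysis is replaced by a plain mean ± k·SD rule around the click point, the VOI is a fixed cube, and the morphological separation and plausibility check (steps 5 and 6) are omitted. All function names, parameters and default values are hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


def estimate_thresholds(volume, seed, radius=1, k=2.5, min_halfwidth=10.0):
    """Estimate (low, high) HU bounds from the voxels around the click point.

    Stand-in for the histogram analysis of step 3 (assumption, see lead-in).
    """
    z0, y0, x0 = seed
    values = [
        volume[z][y][x]
        for z in range(max(0, z0 - radius), min(len(volume), z0 + radius + 1))
        for y in range(max(0, y0 - radius), min(len(volume[0]), y0 + radius + 1))
        for x in range(max(0, x0 - radius), min(len(volume[0][0]), x0 + radius + 1))
    ]
    centre = mean(values)
    # Floor the window so a perfectly uniform neighbourhood still gives a range.
    halfwidth = max(k * pstdev(values), min_halfwidth)
    return centre - halfwidth, centre + halfwidth


def region_grow(volume, seed, low, high, voi_halfsize=20):
    """6-connected 3D region growing from the seed, restricted to a cubic VOI.

    `volume` is indexed as volume[z][y][x] and holds HU values; `seed` is a
    (z, y, x) tuple. Returns the set of segmented voxel indices.
    """
    dims = (len(volume), len(volume[0]), len(volume[0][0]))
    # Step 2 (simplified): cubic VOI around the seed, clipped to the volume.
    bounds = [(max(0, s - voi_halfsize), min(d - 1, s + voi_halfsize))
              for s, d in zip(seed, dims)]
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    segmented = {seed}
    queue = deque([seed])
    while queue:
        z, y, x = queue.popleft()
        for dz, dy, dx in steps:
            nxt = (z + dz, y + dy, x + dx)
            if nxt in segmented:
                continue
            if any(not lo <= c <= hi for c, (lo, hi) in zip(nxt, bounds)):
                continue  # outside the VOI
            if low <= volume[nxt[0]][nxt[1]][nxt[2]] <= high:
                segmented.add(nxt)
                queue.append(nxt)
    return segmented
```

Multiplying the number of segmented voxels by the voxel volume would give a lesion volume, and the voxels of a single axial slice could serve as input for a diameter measurement.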

Statistical analysis

Measured values were summarised as minimum, maximum, arithmetic mean and corresponding standard deviation.

Concordance between manual and semi-automated measurements was represented graphically in Bland-Altman plots [14] (selected plots are shown). Moreover, the degree of agreement between manual and semi-automated measurements was assessed by calculating concordance correlation coefficients [15] with corresponding 95% confidence intervals. The concordance correlation coefficient can range from -1 (perfect discordance) to +1 (perfect concordance). Agreement analysis was conducted for the parameters “longest axial diameter” (R1/O1), “corresponding longest perpendicular diameter in the same image” (R2/O2) and the product of both (R1 × R2 / O1 × O2) in hyperdense, hypodense and heterogeneous metastases of the liver.
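For readers unfamiliar with these agreement measures, a minimal Python sketch is given below. It computes Lin's concordance correlation coefficient in its standard form (population moments, divisor n) and the Bland-Altman bias with 95% limits of agreement (bias ± 1.96 SD of the paired differences). The confidence intervals reported in the tables are not reproduced here, the function names are hypothetical, and the statistical software actually used for the study is not specified in the text.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def concordance_ccc(x: Sequence[float], y: Sequence[float]) -> float:
    """Lin's concordance correlation coefficient for paired measurements."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equally long series with at least two pairs")
    mx, my = mean(x), mean(y)
    var_x = sum((a - mx) ** 2 for a in x) / n
    var_y = sum((b - my) ** 2 for b in y) / n
    cov_xy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    denominator = var_x + var_y + (mx - my) ** 2
    if denominator == 0:
        raise ValueError("coefficient is undefined for identical constant series")
    return 2 * cov_xy / denominator


def bland_altman(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Return (bias, lower limit, upper limit) of agreement for y - x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long series with at least two pairs")
    diffs = [b - a for a, b in zip(x, y)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd
```

Here `x` would hold the manual values (for example R1) and `y` the semi-automated values for the same lesions (for example O1). Unlike the Pearson correlation, the coefficient is reduced by a systematic offset between the two methods through the term (mean of x - mean of y)² in the denominator.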

Results

In 76 patients, 165 target lesions were found and evaluated first by the radiologist and afterwards with the software tool. All 165 lesions were correctly segmented with the software tool.

Hyperdense lesions

Longest axial diameter (RECIST)

The mean longest axial diameter of all 52 hyperdense hepatic lesions was 19.0 mm for manual measurement (R1) with a standard deviation (SD) of 11.0 mm. For semi-automated assessment (O1), the mean longest axial diameter was 21.4 mm with a standard deviation of 10.6 mm. The value of the concordance correlation coefficient was 0.93 (Table 3, Figs. 3, 4).

Table 3 Agreement in hyperdense hepatic lesions (n = 52)
Fig. 3

Screenshot applying the Oncology software tool to a hyperdense hepatic metastasis

Fig. 4

Bland-Altman plot visualising degree of agreement between manual and semi-automated measurement of RECIST criteria in hyperdense hepatic metastasis. R1: manual measurement of the longest axial diameter; O1: semi-automated measurement of the longest axial diameter by the software tool Oncology

Longest diameter perpendicular to RECIST in the same image

Mean values of this parameter were 14.1 mm (R2) (SD: 7.6 mm) and 15.2 mm (O2) (SD: 7.6 mm). The value of the concordance correlation coefficient between the two assessment methods for the maximum diameter perpendicular to the RECIST diameter for hyperdense hepatic lesions was 0.94.

Product of RECIST and its longest perpendicular diameter (WHO)

WHO area measured by the radiologist (R3) showed a mean value of 342.3 mm² (SD: 401.6 mm²). The mean of the semi-automated measurements (O3) was 396.0 mm² (SD: 425.9 mm²). The value of the concordance correlation coefficient between the two was 0.95.

Hypodense lesions

Longest axial diameter (RECIST)

Mean values of the RECIST diameter in the 57 hypodense liver metastases were 17.6 mm (SD: 9.7 mm) by manual assessment (R1) and 19.3 mm (SD: 9.4 mm) (O1) with the software tool. The value of the concordance correlation coefficient between manual evaluation and software assessment was 0.95 (Table 4, Fig. 5).

Table 4 Agreement in hypodense hepatic lesions (n = 57)
Fig. 5

Bland-Altman plot visualising degree of agreement between manual and semi-automated measurement of RECIST criteria in hypodense hepatic metastasis. R1: manual measurement of the longest axial diameter; O1: semi-automated measurement of the longest axial diameter by the software tool Oncology

Longest diameter perpendicular to RECIST in the same image

Radiologist measurements of the longest perpendicular diameter (R2) showed a mean of 14.1 mm (SD: 8.0 mm), and semi-automated assessment (O2) showed a mean of 14.8 mm (SD: 7.6 mm). The value of the concordance correlation coefficient was 0.93.

Product of RECIST and its longest perpendicular diameter (WHO)

WHO analysis resulted in means of 322.0 mm² (SD: 342.0 mm²) for manual assessment (R3) and 350.4 mm² (SD: 348.4 mm²) for semi-automated quantification (O3). The value of the concordance correlation coefficient between the two was 0.98.

Heterogeneous lesions

Longest axial diameter (RECIST)

Mean longest axial diameter of all 56 heterogeneous lesions of the liver was 35.2 mm (SD: 18.9 mm) (R1) when measured manually and 33.3 mm (SD: 19.9 mm) when using the semi-automated measurement technique (O1). The value of the concordance correlation coefficient was 0.94 (Table 5, Figs. 6, 7).

Table 5 Agreement in heterogeneous hepatic lesions (n = 56)
Fig. 6

Screenshot applying the Oncology software tool to a heterogeneous hepatic metastasis

Fig. 7

Bland-Altman plot visualising degree of agreement between manual and semi-automated measurement of RECIST criteria in heterogeneous hepatic metastasis. R1: manual measurement of the longest axial diameter; O1: semi-automated measurement of the longest axial diameter by the software tool Oncology

Longest diameter perpendicular to RECIST in the same image

Mean values of this parameter were 28.5 mm (SD: 15.5 mm) (R2) and 24.1 mm (SD: 15.0 mm) (O2). The value of the concordance correlation coefficient between manual and software assessments was 0.88.

Product of RECIST and its longest perpendicular diameter (WHO)

WHO areas showed mean values of 1,258.9 mm² (SD: 1,467.9 mm²) by radiologist measurement (R3) and of 1,078.3 mm² (SD: 1,423.8 mm²) by the software tool Oncology (O3). The value of the concordance correlation coefficient between the two was 0.93.

Discussion

Metastases are the most common type of malignancy in the noncirrhotic liver. Metastatic involvement of the liver can occur with many neoplasms including melanoma, breast cancer, colon cancer, carcinoid, small-cell carcinoma and pancreatic carcinoma [16]. Accurate assessment of metastatic liver disease is important for treatment planning [17] and for assessing therapy response during or after treatment. Contrast-enhanced MDCT offers a detection sensitivity of 70 to 85% [18]. Although MDCT assesses only morphological parameters, it has become the first-line imaging tool for staging and for monitoring therapy response because of its high availability and low cost compared with MR imaging.

Although fluorodeoxyglucose positron emission tomography (FDG PET) was found to be most accurate on a per-patient basis, treatment policy depends not only on distinguishing patients with liver metastases from those without but mainly on determining the number, size, location and surgical margin of the liver metastases [18]. PET-CT has high sensitivity and specificity for the presence of liver metastases and should be included early in the initial pre-surgical evaluation of patients with colorectal cancer [19], but MDCT remains the first-line investigation tool because of its high availability and low cost. In the future, tumour perfusion may provide additional information on tumour metabolism [20, 21], but until then tumour size and changes in size over time remain the most relevant measures in clinical guidelines. Thus, accurate measurement of the size of hepatic metastases is mandatory for evaluating the primary stage and the tumour response at post-therapeutic follow-up.

As shown by Wormanns et al. [8], manual measurement may lead to remarkable inter- and intraobserver differences in tumour quantity. Thiesse et al., who evaluated the impact of an evaluation committee on patients’ overall response status in a large multicentre trial, discovered that major disagreements occurred in 40% and minor disagreements in 10.5% of the reviewed files. Reasons for disagreements included errors in tumour measurement, errors in selection of measurable targets, intercurrent diseases and radiologic technical problems [22]. Through reliable measurements of hepatic lesions, semi-automated quantification might improve manual performance.

In this context, Marten et al. compared relative values of manual unidimensional measurements and automated volumetry for longitudinal treatment response assessment in patients with pulmonary metastases [5]. Relative measurement errors in this study were significantly higher for manual unidimensional measurements than for automated volumetry. Thus, automated volumetry showed better reproducibility of response evaluation for pulmonary metastases and should be preferred to manual unidimensional measurements. Although many studies have argued in favour of automated volumetry of pulmonary nodules [5-10], fewer studies in the literature address automatic tumour quantification for other neoplasms, especially in the liver.

In one of the earliest studies, De Vriendt et al. [23] tested the feasibility of semi-automated quantification in liver lesions. Their clinical trial suggested that semi-automated quantitative measurements were possible in daily clinical practice, but the authors called for further investigation to test their clinical usefulness.

Zhao et al. [24] showed high accuracy in developing a shape-constraint region-growing algorithm to automatically delineate liver metastases on computed tomography images. Similar to our results, they showed comparable results for semi-automated measurements in comparison with manual uni- and bidimensional measurements. Heussel et al. [25] compared quantitative malignant liver tumour response between RECIST/WHO criteria and volumetry using semi-automated measurement techniques in an intraindividual CT study and concluded that only volumetric quantification in thin sections leads to reliable evaluation of tumour development. Hence, the main intent for valid and reliable tumour quantification when assessing tumour response is to decrease intra- and interobserver variability that is associated with manual measurements. To achieve this, (semi-)automated tumour measurement should be a feasible approach.

To the best of our knowledge, no other study has compared manual measurement with semi-automated segmentation on standard reconstructed 5-mm MDCT slices (at 4-mm increments) for hyperdense hepatic metastases in the arterial phase and hypodense and heterogeneous liver lesions in the portal-venous phase. Using the two well-established parameters (WHO [2] and RECIST [3, 4]), we could show good comparability of both measurement techniques with a very high concordance for all three measured types of malignant liver lesions.

It should be noted that every lesion of any density in our study could be quantified precisely by the software tool on standard axial slice MDCT without further thin-sliced reconstructions, simplifying the clinical routine.

Limitations

Only one radiologist performed manual measurements; however, all measurements were checked by another radiologist afterwards, so we assume no major differences would be observed by independent multiple measurements by additional radiologists. Because this was a retrospective evaluation, the relevance of our study may not be as meaningful as measurements of a prospective study, but we tried to simulate prospective conditions by performing manual quantification first and semi-automated evaluation of the same liver lesion after. Only well-defined hyperdense, hypodense and heterogeneous hepatic metastases were chosen, but in routine scanning, ill-defined lesions are also common. Further investigations are necessary to assess the accuracy of semi-automated quantification of these kinds of lesions.

In conclusion, semi-automated measurement of well-defined liver lesions proved feasible on standard axial reconstructions and showed high concordance with manual quantification.