ANALYSIS OF CONTOURS AND SIGNATURES

Shape Analysis of Breast Tumors

Breast tumors and masses appear in mammograms with different shape characteristics: malignant tumors usually have rough, spiculated, or microlobulated contours, whereas benign masses commonly have smooth, round, oval, or macrolobulated contours.1,2 Measures that can quantitatively represent shape roughness and complexity can assist in the classification of malignant tumors and benign masses.3,4 Objective features of shape complexity such as compactness, fractional concavity (F cc), spiculation index (SI), a Fourier-descriptor-based factor (FF), fractal dimension (FD), moments, chord-length statistics, and wavelet transform modulus-maxima have been developed to distinguish benign masses from malignant tumors using pattern recognition methods for computer-aided diagnosis (CADx) of breast cancer.3–10

However, atypical cases of macrolobulated or spiculated benign masses, as well as microlobulated or well-circumscribed malignant tumors, create difficulties in pattern classification.3,4 Regardless, in comparative analysis of several features of shape, edge-sharpness, and texture for the classification of breast masses and tumors, shape factors such as F cc, FF, and SI have been observed to lead to higher classification accuracies than measures related to texture and density variation.5,7,11

Notwithstanding the relative success of measures of shape in the classification of breast tumors and masses, obtaining precise and artifact-free boundaries of masses from mammograms remains a difficult problem.7,12,13 Computer-detected contours may be expected to contain inaccuracies and artifacts due to the limitations of the procedures for the detection and segmentation of masses in mammograms; contours of masses drawn manually on mammograms by radiologists may contain noise related to hand tremor.

In this work, we propose a novel approach to obtain signatures of contours based on the turning angle function that takes into account reduction of noise and artifacts. We also propose the development of methods based on the signature to extract shape descriptors with the aim of classifying the contours of breast masses and tumors.14 Different from our previous work15 where the turning angle function was used to derive a polygonal model of a given contour, the signature of the contour being proposed in the present work, while retaining relevant characteristics of the contour, does not facilitate the derivation of a polygonal model of the original contour. In other related works,16,17 we have proposed a different procedure to derive a polygonal model of a given contour that preserves spicules and details of diagnostic importance. We have also proposed a method to derive a spiculation index from the turning angle function obtained from a polygonal model of a given contour.16,17 In the present work, we present new methods to derive shape factors that represent the presence of convex and concave regions in the contour, to compute an index of convexity, and to estimate the fractal dimension obtained from the signature derived from the turning angle function of the original contour.

To evaluate the performance of the proposed shape descriptors in terms of the efficiency in the classification of breast masses, we compare the results with those provided by fractional concavity (F cc) using the method of Rangayyan et al.4 and FD as obtained by Rangayyan and Nguyen8 in terms of the area A z under the receiver operating characteristics (ROC) curve.

Signatures of Contours

Signatures of contours may be used to analyze their shapes. The most commonly used method to transform a two-dimensional (2D) contour into a one-dimensional (1D) signature is in terms of the radial distance from each contour point to the centroid of the contour expressed as a function of the index of the contour point. Given a contour with N points {x(n), y(n)}, n = 1,2,...,N, the signature S d(n) is defined as \( S_{d}(n) = \sqrt{\left[ x(n) - \overline{x} \right]^{2} + \left[ y(n) - \overline{y} \right]^{2}} \). Here, \( (\overline{x}, \overline{y}) \) is the centroid of the contour, with the coordinates given by the averages of the corresponding coordinates of all of the contour points. A benign mass that is round or macrolobulated will have a smooth signature; on the contrary, a malignant tumor that is spiculated or microlobulated will have a rough and jagged signature.18 The 1D signature of a contour as above may be used to derive the fractal dimension (FD) to represent the complexity of the contour.8
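The radial-distance signature defined above can be illustrated with a short sketch. The function name and the use of NumPy are choices made here for illustration and are not part of the original method description.

```python
import numpy as np

def radial_distance_signature(x, y):
    """Distance from each contour point to the contour centroid.

    x, y: coordinates of the N contour points, in contour order.
    The centroid is taken as the mean of the point coordinates,
    as stated in the text.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_bar, y_bar = x.mean(), y.mean()
    return np.hypot(x - x_bar, y - y_bar)
```

For a uniformly sampled circle the signature is constant and equal to the radius, which matches the expectation that a round benign mass yields a smooth signature.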

Another type of signature S d(n) may be defined as the complex number \( S_{d}(n) = x(n) + j\,y(n) \) where \( j = \sqrt{-1} \). Fourier descriptors and normalized shape factors to characterize roughness may be derived from S d(n).3,18

Pohlman et al.19 defined the signature of a given contour of a breast mass as the radial distance to the contour from its centroid expressed as a function of the angle of the radial line in the interval (0°, 360°). Such a function could be multivalued for an irregular or spiculated contour. The signature computed in this manner would also be undefined, in certain ranges of the angle, for a contour for which the centroid falls outside the region enclosed by the contour.

A major advantage with the use of 1D signatures is the reduction in dimensionality from the corresponding 2D contours. Signatures may be filtered or processed for the reduction of noise and artifacts in the contour. Furthermore, shape factors may be derived more easily from 1D signatures than from the corresponding 2D contours.

THE TURNING ANGLE FUNCTION

The turning angle function T C (S n ) of the contour C is the cumulative function of turning angles and may be obtained by deriving the counterclockwise angle between the tangent at the segment S n and the x-axis and expressing it as a function of the arc length of S n . The turning angle function is also known as the tangent function and has been used as a signature to represent the shape of a given contour (or its polygonal model) and used in applications related to shape retrieval.20–27 The turning angle function keeps track of the turning angle of the contour, increasing with convex regions and decreasing with concave regions. The turning angle of a segment S i is the difference or step between T C (S i + 1) and T C (S i ). The turning angle ranges in the interval (−180°, 180°). Negative values represent concave regions and positive values represent convex regions. For a convex contour, T C (S n ) is a monotonic function, starting at some value φ and increasing to φ + 2π. For a non-convex polygon, T C (S n ) can become arbitrarily large, as it accumulates the total amount of turning angles, obeying the range of 2π between the starting point and the final point.27
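As a rough illustration of the definition above, the following sketch computes a turning angle function for a closed polygon. It assumes the vertices are listed counterclockwise without repeating the first point, treats each polygon edge as one segment, and wraps each step to [−180°, 180°); these conventions and all names are assumptions made for the example.

```python
import numpy as np

def turning_angle_function(vertices):
    """Cumulative turning angle (degrees) per segment of a closed polygon.

    vertices: (N, 2) array, counterclockwise, first point not repeated.
    Returns (arc_start, cumulative, lengths), where arc_start[n] is the
    arc length at which segment n begins.
    """
    pts = np.asarray(vertices, dtype=float)
    seg = np.roll(pts, -1, axis=0) - pts          # segment n: vertex n -> n+1
    lengths = np.hypot(seg[:, 0], seg[:, 1])
    direction = np.degrees(np.arctan2(seg[:, 1], seg[:, 0]))
    # Step between consecutive segments, wrapped to [-180, 180):
    # positive for convex (left) turns, negative for concave turns.
    steps = (np.diff(direction) + 180.0) % 360.0 - 180.0
    cumulative = direction[0] + np.concatenate(([0.0], np.cumsum(steps)))
    arc_start = np.concatenate(([0.0], np.cumsum(lengths)[:-1]))
    return arc_start, cumulative, lengths
```

For a unit square the function rises in steps of 90° from 0° to 270° over the four segments; the closing turn back to the first segment would bring the total to 360°, in line with the 2π range noted in the text.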

Figures 1 and 2 show the turning angle functions of the contours of a benign mass and a malignant tumor, respectively. It is evident that the turning angle function of the malignant tumor is more complex and rough than that of the benign mass. However, the turning angle functions include noise due to artifacts in the contours and need to be filtered before further analysis.

Fig 1

a A manually drawn contour of a benign mass with a relatively smooth and convex contour with 916 points (pixels) and resolution of 50 μm per pixel. b Turning angle function of the contour with TC(S1) = 0° and TC(S916) = 360°.

Fig 2

a A manually drawn contour of a malignant tumor with a spiculated contour including concave and convex segments with 2,478 points (pixels) and resolution of 50 μm per pixel. b Turning angle function of the contour with TC(S1) = 315° and TC(S2478) = 675°.

SIGNATURE BASED ON THE TURNING ANGLE FUNCTION

Figures 1b and 2b illustrate the turning angle functions of the contours of a benign mass and a malignant tumor, respectively. It is readily seen that while the former is a nearly monotonically increasing function (except for the effects of noise or minor variations), the latter has many decreasing and increasing segments related to the concave and convex regions present in the contour of the tumor. The examples indicate that the turning angle function may be used to represent the complexity as well as the variations present in the shapes of breast masses and tumors.

However, computer-detected contours and hand-drawn contours, such as those shown in Figures 1a and 2a, could contain artifacts and noise related to hand tremor and other limitations. As a consequence, the turning angle function could be expected to contain several small segments that are insignificant in the representation of the contours for further analysis, as highlighted in Figure 3. For this reason, we propose to filter the turning angle function in a selective manner so as to remove the artifacts and noise while preserving the significant details.14 The proposed filtering procedure is an iterative process controlled by the size of the segments and the turning angle between adjacent segments. Two rules are applied to every linear segment S i identified from the turning angle function in each iteration:

  1. Rule 1

    if the current segment S i and the next segment S i + 1 are both shorter than a threshold S min , then join S i and S i + 1. The length of the combined segment is equal to the length of the straight line connecting the starting point of S i and the ending point of S i + 1 in the original contour domain. The turning angle of the combined segment is derived as described in Section 2.

  2. Rule 2

    if the length of S i or S i + 1 is greater than the threshold S min , then analyze the turning angle between S i and S i + 1: if (180° − abs(T C (S i + 1) − T C (S i ))) ≥ θmax, then join S i and S i + 1; else retain S i and S i + 1. The procedure for joining two segments is described under Rule 1.
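The two rules can be sketched on a closed polygon, where joining S i and S i + 1 amounts to deleting their shared vertex so that the combined segment is the straight line between the remaining end points. The scanning order, the handling of segments exactly equal to S min, and the guard that keeps at least three vertices are assumptions of this example and are not specified in the text.

```python
import math

def _turn_deg(a, b, c):
    """Signed turning angle (degrees) from segment a->b to segment b->c."""
    d1 = math.degrees(math.atan2(b[1] - a[1], b[0] - a[0]))
    d2 = math.degrees(math.atan2(c[1] - b[1], c[0] - b[0]))
    return (d2 - d1 + 180.0) % 360.0 - 180.0

def filter_contour(vertices, s_min=10.0, theta_max=170.0):
    """Iteratively join adjacent segments of a closed polygon.

    Rule 1: join when both segments are shorter than s_min.
    Rule 2: otherwise join when the internal angle,
            180 - |turning angle|, is at least theta_max.
    Stops when a full pass joins nothing (or only 3 vertices remain).
    """
    pts = [(float(x), float(y)) for x, y in vertices]
    changed = True
    while changed:
        changed = False
        i = 0
        while i < len(pts) and len(pts) > 3:
            n = len(pts)
            a, b, c = pts[i], pts[(i + 1) % n], pts[(i + 2) % n]
            if math.dist(a, b) < s_min and math.dist(b, c) < s_min:
                join = True                                   # Rule 1
            else:
                join = (180.0 - abs(_turn_deg(a, b, c))) >= theta_max  # Rule 2
            if join:
                del pts[(i + 1) % n]   # drop the shared vertex
                changed = True
            else:
                i += 1
    return pts
```

With S min = 10 and θ max = 170°, a square whose side carries a slight kink or a small tremor-like bump is reduced to its four corners, while the 90° corners themselves are retained.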

Fig 3

A manually drawn contour of a malignant tumor: a Adjacent segments within the dashed ellipse possess high internal angles and are small (caused by hand tremor or other limitations). Some adjacent segments within the solid ellipse present relevant internal angles. b The respective artifacts of the highlighted regions on the contour represented in the turning angle function. The region inside the dashed ellipse is represented in the turning angle function as the region between the dashed lines with a sequence of small segments with different directions. The region inside the solid ellipse is represented in the turning angle function as a sequence of segments of different sizes with large changes in direction.

The threshold S min represents the relevance of a segment and θ max indicates the relevance of the turning angle between two adjacent segments of the contour being analyzed. The relevance of the segment is related to the resolution of the image and the requirements of the application. A high value for θ max means that when the internal angle between two adjacent segments is large, then the segments should be joined. The procedure stops when no segments are joined in an iteration. Figure 4 illustrates the filtering procedure applied to the turning angle function in Figure 3. Note that small segments with no relevant angles have been joined, resulting in a smoother turning angle function. Figures 5 and 6 illustrate the filtered versions of the turning angle functions and the contours corresponding to those in Figures 1 and 2, with S min = 10 pixels (equivalent to 0.5 mm according to the image resolution) and θ max = 170°. The filtered turning angle function maintains all the relevant information to reconstruct a polygonal model of a given contour.14,15 The resulting polygonal model is free of major artifacts and noise, while preserving important spicules and lobules in the given contour.

Fig 4

a Filtered version of the contour in Figure 3a with reduction of artifacts. b Filtered version of the turning angle function in Figure 3b with Smin = 10 pixels and θmax = 170°.

Fig 5

a Filtered version of the contour in Figure 1a with reduction of artifacts. b Filtered version of the turning angle function in Figure 1b with Smin = 10 pixels and θmax = 170°.

Fig 6

a Filtered version of the contour in Figure 2a with reduction of artifacts. b Filtered version of the turning angle function in Figure 2b with Smin = 10 pixels and θmax = 170°.

Although the filtered turning angle function preserves only the significant angles and segments of the original contour, the successive increasing or decreasing sections (see Figs. 5 and 6) do not give any extra information to derive shape factors related to the complexity of the contour, such as fractal dimension and index of convexity. For this reason, we propose to further smooth the filtered turning angle function with the aim of retaining information only about the presence of concave and convex regions in the original contour.

The smoothed filtered turning angle function, which will be referred to in the rest of this paper as the signature based on the turning angle function (or STAF for short), is obtained by replacing each monotonically increasing or decreasing section of the filtered turning angle function by a representative segment and its corresponding turning angle. The new segment length is obtained by summing all related individual segment lengths in the increasing or decreasing section, and the new turning angle is obtained by computing the average of the relative turning angles of the corresponding segments. The STAFs of the benign mass in Figure 1 and the malignant tumor in Figure 2 are shown in Figure 7. Note that the STAF of a nearly convex contour is almost constant, as illustrated in Figure 7a; on the other hand, the STAF of a contour with concavities possesses several variations, as shown in Figure 7b. The STAF, as computed above, does not permit the reconstruction of the original contour or any filtered version thereof.
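The collapsing of monotonic sections described above can be sketched as follows. The input is assumed to be the list of segment lengths and the corresponding values of the filtered turning angle function; the way segments are assigned to sections (each segment belongs to the section of the step leading into it, and the first segment joins the first section) is an assumption of this example, as are the names.

```python
import numpy as np

def staf_sections(lengths, cumulative_angles):
    """Collapse each monotonic run of the filtered turning angle function.

    lengths[k]           : length of segment k
    cumulative_angles[k] : value of the filtered function on segment k
    Returns a list of (section_length, mean_step), one per run of steps
    with the same sign; mean_step is the average relative turning angle.
    """
    lengths = np.asarray(lengths, dtype=float)
    steps = np.diff(np.asarray(cumulative_angles, dtype=float))
    if steps.size == 0:
        return [(float(lengths.sum()), 0.0)]
    signs = np.sign(steps)
    sections = []
    start = 0
    for k in range(1, len(steps) + 1):
        if k == len(steps) or signs[k] != signs[start]:
            run = steps[start:k]                 # steps start .. k-1
            # These steps lead into segments start+1 .. k; the first run
            # also absorbs segment 0 (assumption of this sketch).
            first_seg = 0 if start == 0 else start + 1
            section_len = lengths[first_seg:k + 1].sum()
            sections.append((float(section_len), float(run.mean())))
            start = k
    return sections
```

A function that only increases, as for a nearly convex contour, collapses to a single section, whereas a contour with concavities produces alternating sections with positive and negative mean steps.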

Fig 7

Signatures based on the turning angle function with S min = 10 pixels and θ max = 170° of: a The benign mass with a nearly convex contour shown in Figure 1; b The malignant tumor with a spiculated contour shown in Figure 2. See also Figures 5 and 6.

FEATURE EXTRACTION FROM THE STAF

In this section, we present a set of feature descriptors derived from the STAF of the given contour. The proposed feature descriptors include two different measures of fractal dimension, indices that represent the presence of concave and convex regions in the contour, and an index of convexity.

Fractal Dimension

Fractal analysis may be used to study the complexity and roughness of 1D functions, 2D contours, and images.8,28–33 Fractal analysis may be applied to classify breast masses based on the complexity of their contours.8 Matsubara et al.34 obtained 100% accuracy in the classification of 13 breast masses using FD. The method required the computation of a series of FD values for several contours of a given mass obtained by thresholding the mass at many levels; the variation in FD was used to categorize a given mass as benign or malignant. Pohlman et al.19 obtained a classification accuracy of more than 80% with fractal analysis of signatures of contours of masses based on the radial distance as described in Section 1.2. Rangayyan and Nguyen8 estimated the FD of a set of 111 contours of breast masses and tumors using the ruler and the box-counting methods applied to the 2D contours as well as their 1D signatures [S d(n) as described in Sect. 1.2]. The best classification performance with A z = 0.89 was obtained with the ruler method applied to the 1D signatures of the contours.

In the present work, the ruler method is applied to the STAFs of the contours of breast masses (referred to as FD TA ) and to the first derivatives of the STAF (FDd TA ). See Rangayyan and Nguyen8 for details on the ruler method. Each STAF is normalized along both axes to the interval [0,1]. The slope of the curve log(r) vs log(N*r), that is, the log of the size of the ruler (r) vs the log of the number of rulers (N) times the ruler size, is obtained as an estimate of FD TA or FDd TA .
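A simplified version of the ruler procedure may be sketched as below. This is only an approximation under stated assumptions: the curve is walked point to point without interpolation, the measured length (the sum of the chords actually stepped) stands in for N times r, and the set of ruler sizes is arbitrary. The exact procedure, and the convention that maps the fitted slope to the reported FD values, are those of Rangayyan and Nguyen8 and are not reproduced here.

```python
import numpy as np

def _unit_range(v):
    """Normalize values to [0, 1]; a constant input maps to zeros."""
    v = np.asarray(v, dtype=float)
    span = v.max() - v.min()
    return (v - v.min()) / span if span > 0 else np.zeros_like(v)

def ruler_length(x, y, r):
    """Approximate curve length measured with a ruler of size r.

    Steps to the first sample at least r away from the current one
    (no interpolation: an assumption of this sketch).
    """
    pts = np.column_stack([x, y])
    total, current = 0.0, 0
    for k in range(1, len(pts)):
        d = np.hypot(*(pts[k] - pts[current]))
        if d >= r:
            total += d
            current = k
    total += np.hypot(*(pts[-1] - pts[current]))  # leftover piece
    return total

def ruler_slope(x, y, rulers):
    """Slope of log(measured length) against log(ruler size)."""
    xn, yn = _unit_range(x), _unit_range(y)
    lengths = [ruler_length(xn, yn, r) for r in rulers]
    slope, _intercept = np.polyfit(np.log(rulers), np.log(lengths), 1)
    return float(slope)
```

A constant signature gives the same length for every ruler and hence a slope of zero, consistent with the zero value reported in the paper for a convex contour; a rough signature loses measured length as the ruler grows, so the magnitude of the slope increases with roughness.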

Index of the Presence of Concave Regions and Index of the Presence of Convex Regions

A study of the presence of concave or convex regions may be used to characterize a given contour according to relevant changes in the direction. Related features may be used to classify contours of breast masses as benign or malignant.2,4 Such information could be used to discriminate between lobulated or spiculated contours and relatively smooth or convex contours. To characterize the roughness of a contour, we propose the indices VR TA and XR TA to measure the presence of concave regions and convex regions in a given contour, respectively. Both indices are normalized to the interval [0,1].

VR TA is defined as

$$ \mathrm{VR}_{TA} = \frac{\sum_{i=1}^{\mathrm{Nd}} \left( 1 + \cos\theta(i) \right) L_{a}(i)}{2 \sum_{i=1}^{\mathrm{Nd}} L_{a}(i)} $$
(1)

where L a (i) is the sum of the lengths of two adjacent segments S i and Si + 1, joined by a drop in the turning angle θ(i), obtained from the STAF, and Nd is the number of drops in angle in the STAF. For a convex contour, the value for VR TA is equal to zero.

XR TA is defined as

$$ \mathrm{XR}_{TA} = 1 - \left( \frac{\sum_{j=1}^{\mathrm{Ni}} \left( 1 + \cos\phi(j) \right) L_{b}(j)}{2 \sum_{j=1}^{\mathrm{Ni}} L_{b}(j)} \right) $$
(2)

where L b (j) is the sum of the lengths of two adjacent segments S j and Sj + 1, joined by an increase in the turning angle ϕ(j), obtained from the STAF, and Ni is the number of steps with increasing angles in the STAF. For a convex contour, the value for XR TA is equal to 1.

Index of Convexity

The index of convexity CX TA combines information regarding the presence of concave regions and convex regions in the contour.

CX TA is defined as

$$ \mathrm{CX}_{TA} = 1 - \left( \frac{\mathrm{VR}_{TA}}{2} + \frac{1 - \mathrm{XR}_{TA}}{2} \right) $$
(3)

CX TA is normalized to the interval [0,1]. For a convex contour, the value for CX TA is equal to 1. The index decreases as the presence of concave regions increases.
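Equations 1 to 3 translate directly into code. In the sketch below, each drop or increase is given as a pair of angle (in degrees, as used elsewhere in the text) and combined length; how θ(i) and ϕ(j) are read off the STAF is left to the caller. Returning 0 for VR TA and 1 for XR TA when there are no drops or increases is an assumption chosen to match the values stated for a convex contour.

```python
import math

def _weighted_cosine(pairs):
    """sum((1 + cos(angle)) * L) / (2 * sum(L)), or None if there are no terms."""
    pairs = list(pairs)
    total = sum(length for _, length in pairs)
    if total == 0:
        return None
    num = sum((1.0 + math.cos(math.radians(a))) * length for a, length in pairs)
    return num / (2.0 * total)

def vr_ta(drops):
    """Eq. 1. drops: iterable of (theta_deg, L_a). Empty -> 0 (assumed)."""
    value = _weighted_cosine(drops)
    return 0.0 if value is None else value

def xr_ta(increases):
    """Eq. 2. increases: iterable of (phi_deg, L_b). Empty -> 1 (assumed)."""
    value = _weighted_cosine(increases)
    return 1.0 if value is None else 1.0 - value

def cx_ta(vr, xr):
    """Eq. 3: index of convexity from VR_TA and XR_TA."""
    return 1.0 - (vr / 2.0 + (1.0 - xr) / 2.0)
```

As a check against the values quoted in the figure captions of the paper, VR TA = 0.42 and XR TA = 0.64 give CX TA = 1 − (0.21 + 0.18) = 0.61, and VR TA = 0.24 with XR TA = 0.78 gives 0.77.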

DATA USED: CONTOURS OF BREAST MASSES

Mammograms of 20 cases were obtained from Screen Test: the Alberta Program for the Early Detection of Breast Cancer.5,35,36 The mammograms were digitized using the Lumiscan 85 scanner at a resolution of 50 μm with 12 b/pixel. Fifty-seven regions of interest, of which 37 are related to benign masses and 20 are related to malignant tumors, were obtained.5 The sizes of the benign masses vary in the range 39–437 mm2, with an average of 163 mm2 and a standard deviation of 87 mm2. The sizes of the malignant tumors vary in the range 34–1,122 mm2, with an average of 265 mm2 and a standard deviation of 283 mm2. Most of the benign masses in this dataset are smooth or macrolobulated, whereas most of the malignant tumors are spiculated or microlobulated.

Mammograms containing masses were also obtained from the Mammographic Image Analysis Society (MIAS, UK) database37,38 and the teaching library of the Foothills Hospital (Calgary)3,4. The MIAS images were digitized at a resolution of 50 μm with 8 b/pixel. The Foothills Hospital images were digitized at 62 μm per pixel with 8 b/pixel. This set includes 28 benign masses and 26 malignant tumors with smooth, lobulated, and spiculated contours in both the benign and malignant categories. The sizes of the benign masses vary in the range 32–1,207 mm2, with an average of 281 mm2 and a standard deviation of 288 mm2. The sizes of the malignant tumors vary in the range 46–1,244 mm2, with an average of 286 mm2 and a standard deviation of 292 mm2.

Contours of the masses in the images described above were drawn by an expert radiologist specialized in mammography. The combined dataset used in the present study includes 111 contours, with typical and atypical shapes of 65 benign masses and 46 malignant tumors. The diagnostic classification was based upon biopsy (the present work employs the same dataset as that used by Rangayyan and Nguyen8).

RESULTS AND DISCUSSION

The methods were tested with a set of 111 contours of breast masses; see Section 5 for details regarding the dataset used. Figures 8, 9, 10 and 11 present a set of contours of breast masses and tumors with different shapes, and the respective STAF with S min = 10 pixels and θ max = 170°. It is worth noting that a convex contour, as shown in Figure 8a, possesses a STAF represented by a constant, resulting in a value equal to zero for FD TA , FDd TA , and VR TA , and a value equal to 1 for XR TA and CX TA .

Fig 8

a A manually drawn contour of a circumscribed benign mass. FDTA = 0.0, FDdTA = 0.0, VRTA = 0.0, XRTA = 1.0, and CXTA = 1.0. b The signature based on the turning angle function with Smin = 10 pixels and θmax = 170°.

Fig 9

a A manually drawn contour of a macrolobulated malignant tumor. FDTA = 0.14, FDdTA = 0.11, VRTA = 0.45, XRTA = 0.98, and CXTA = 0.76. b The signature based on the turning angle function with Smin = 10 pixels and θmax = 170°.

Fig 10

a A manually drawn contour of a microlobulated malignant tumor. FDTA = 0.32, FDdTA = 0.30, VRTA = 0.24, XRTA = 0.78, and CXTA = 0.77. b The signature based on the turning angle function with Smin = 10 pixels and θmax = 170°.

Fig 11

a A manually drawn contour of a spiculated malignant tumor. FDTA = 0.61, FDdTA = 0.57, VRTA = 0.42, XRTA = 0.64, and CXTA = 0.61. b The signature based on the turning angle function with Smin = 10 pixels and θmax = 170°.

To evaluate the proposed methods, the STAFs of the 111 contours in the dataset were obtained with several sets of parameters. The shape descriptors VR TA , XR TA , CX TA , FD TA , and FDd TA were derived for each STAF. A sliding threshold was applied to each feature descriptor directly to classify the corresponding mass as benign or malignant. This classification strategy was used because in each experiment, each contour is represented by only one feature; consequently, the classifier does not require any training step. The diagnosis of each mass, as provided by biopsy, was used to validate the classification. The true-positive fraction (TPF) and false-positive fraction (FPF) were computed for each threshold using the results for all of the 111 contours. To evaluate the classification performance of each feature, an ROC curve39 was generated for each experiment, with the sensitivity given by the TPF and the specificity given as 1 − FPF.
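The sliding-threshold evaluation can be sketched as follows. The example assumes that larger feature values indicate malignancy (for features such as CX TA , where convex benign contours score higher, the caller would negate the values) and estimates the area under the curve with the trapezoidal rule; the paper does not state how A z was computed, so that choice is an assumption, as are the function names.

```python
import numpy as np

def roc_points(scores, is_malignant):
    """TPF and FPF for every threshold on a single feature.

    A mass is called malignant when its score is >= the threshold.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(is_malignant, dtype=bool)
    cuts = np.sort(np.unique(scores))[::-1]
    thresholds = np.concatenate(([np.inf], cuts, [-np.inf]))
    tpf = np.array([(scores[labels] >= t).mean() for t in thresholds])
    fpf = np.array([(scores[~labels] >= t).mean() for t in thresholds])
    return fpf, tpf

def area_under_roc(fpf, tpf):
    """Trapezoidal estimate of the area under the ROC curve."""
    return float(np.sum(np.diff(fpf) * (tpf[1:] + tpf[:-1]) / 2.0))
```

A feature that separates the two classes perfectly yields an area of 1.0, and a feature with no discriminating power yields a value near 0.5.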

To evaluate the impact of the choice of S min on the final results, we tested the features with three different values of S min (5, 10, and 20 pixels), with θ max set at 170°. The area Az under the ROC curve was computed for each case to serve as a measure of the classification performance of the corresponding feature, as summarized in Table 1. Note that when the indices are applied to characterize contours of lesions in mammograms, with the pixel resolution being 50 or 62 μm, the best results were obtained with S min set at 10 pixels, which is equivalent to 0.5 or 0.6 mm. The parameters need to be adjusted according to the requirements of the application. Because the performance of the classifier changes when different parameters are used in deriving the polygonal models, we analyzed the statistical significance of the difference between the features for each set of feature descriptors. Considering the parameter S min = 10 pixels as the reference, the results obtained using FDd TA with all the different combinations of the parameters and FD TA and VR TA with S min = 20 pixels do not present differences that are statistically significant with p ≤ 0.05.

Table 1 Comparison of the Classification Performance of the Proposed Indices with Different Values for S min, and θ max set at 170°

To compare the performance of the results obtained with the proposed features with the results of other features, we computed the A z value for the shape factor F cc 4 as well as FD obtained using the ruler method applied to 1D signatures of contours as reported by Rangayyan and Nguyen.8 The results are also shown in Table 1. The table also lists, for comparison, the A z values for a spiculation index, compactness, and a Fourier-descriptor-based shape factor as obtained by Rangayyan and Nguyen.8 It is seen that the shape features proposed in the present paper have provided the best results. Further studies are required to evaluate the classification performance of various combinations of the shape features proposed in the present work and previous related works.

The shape factors proposed in this work could also be used to classify breast masses and tumors in terms of margins that are circumscribed, macrolobulated, microlobulated, or spiculated. Such categorization could assist in the preparation of reports in accordance with the terminology used in BI-RADS.2 Further work is in progress to derive fuzzy rules to classify breast masses using various combinations of the proposed shape factors.

CONCLUSION

We have proposed methods to obtain shape features from signatures based on filtered turning angle functions of contours. The features have been shown to be useful in the analysis of contours of breast masses and tumors because of their ability to capture diagnostically important details of shape related to spicules and lobulations. The proposed features have provided high classification accuracies in discriminating between benign breast masses and malignant tumors. The methods should be useful in computer-aided diagnosis of breast cancer.