Abstract
This study presents a computer-aided diagnosis (CAD) system that uses textural features to classify benign and malignant breast tumors in medical ultrasound images. A series of pathologically proven breast tumors was evaluated using a support vector machine (SVM) for the differential diagnosis of breast tumors. The proposed CAD system uses simple textural features, namely the block difference of inverse probabilities, the block variation of local correlation coefficients and the auto-covariance matrix, to characterize breast tumors. An SVM classifier using these textural features classifies each tumor as benign or malignant. The proposed system identifies breast tumors with comparatively high accuracy, which can help inexperienced physicians avoid misdiagnosis. The main advantage of the proposed system is that the training and diagnosis procedures of the SVM are faster and more stable than those of multilayer perceptron (MLP) neural networks. As the database expands, new cases can easily be gathered and used as references. The proposed approach dramatically reduces the training and diagnosis time. The SVM is a reliable choice for the proposed CAD system because it is fast and performs well in ultrasound image classification.
1 Introduction
Breast cancer is one of the leading causes of cancer death among women in both developed and developing countries. Early treatment requires early diagnosis, and early diagnosis requires an accurate and reliable diagnostic procedure that allows physicians to differentiate benign breast tumors from malignant ones [1]. The medical imaging studies most frequently adopted for early detection and diagnosis of breast cancer are mammography and ultrasonography. Ultrasound examination, which is non-invasive and non-radiative, is a more convenient and suitable tool for palpable tumors in daily clinical practice. However, ultrasound examination is highly operator dependent, and echogenicity alone is not specific enough to distinguish benign from malignant lesions. The examination described by Stavros et al. [2] is much more extensive than the usual examinations performed at most breast imaging centers, and its diagnostic results were achieved by experienced radiologists. Many invasive diagnostic procedures are therefore still required in most cases. Most of these procedures could be avoided if a more specific diagnostic test were available, because the rate of positive findings in a biopsy for cancer is low [3]. Thus, a computer-aided diagnosis (CAD) system is expected to be helpful in diagnosing breast cancer, given the difficulty of such diagnoses.
Chen et al. [4–7] applied textural features of breast ultrasound images to differentiate between benign and malignant tumors with neural network classifiers. Textural variation in the ultrasound image has been deemed a useful characteristic for distinguishing benign from malignant tumors [8]. Their CAD system uses a multilayer perceptron (MLP) neural network and achieves good diagnostic results. However, the training process is prolonged, and diagnostic performance normally relies on the initial parameter settings [9]: the number of neurons, the learning rate and the momentum value are hard to choose, and their selection affects the results drastically. In contrast, the support vector machine (SVM) has demonstrated its feasibility and superiority in extracting higher-order statistics, and it has become extremely popular for classification and prediction. This study employs the SVM model as a classifier instead of the MLP for identifying benign and malignant lesions in ultrasound images. The proposed diagnosis system classifies breast ultrasound images more accurately and efficiently. The SVM is a reliable choice for the proposed system because it is fast and performs well in ultrasound image classification.
2 Materials and methods
Normally, a physician can readily pinpoint a tumor in an ultrasound image from the tumor shape and the contrast of internal echoes. Automatic tumor segmentation of an ultrasound image, however, is hard, and to our knowledge no satisfactory approach exists so far. Thus, in this study the physician manually extracted the rectangular sub-image of the region of interest (ROI). The rectangular ROI extended around 0–5 mm beyond the tumor border. The proposed system employed intensity variation and textural information from the ROI sub-images as features with which to diagnose breast tumors.
2.1 Data acquisition
The ultrasound image database comprises 250 images of pathologically proven breast tumors: benign tumors from 215 patients and carcinomas from 35 patients. The ultrasound images were captured at the largest diameter of the tumor. The breast ultrasound image databases contain only histologically confirmed cases (either by fine needle aspiration, core needle biopsy or open biopsy). Ultrasound images were acquired using an ALOKA SSD 1200 (Tokyo, Japan) scanner with freeze-frame capability and a 7.5 MHz linear transducer. No acoustic standoff pad was used in any of the cases. The databases were supplied by the coauthor, an experienced physician, Dr. Chen. Most of the cases were pathologically proven and some were followed up for at least 2 years.
The monochrome ultrasound images in the databases are quantized into 256 gray levels, i.e., eight bits per pixel. The ROI is manually selected by Dr. Chen using the ProImage package. That is, the hardware-bundled software package was used to capture the full image from the ultrasonic scanner and to select the ROI manually. The ROI was then saved in digital form for later analysis by the proposed SVM system. Figure 1a shows a real-time digitized monochrome ultrasound image. Figure 1b presents an extracted ROI for the tumor.
2.2 Ultrasonic textural analysis
An ultrasound image consists of many points with different gray-level intensities. Different tissues have significantly different textures, so the textural variation between benign and malignant tumors is an effective feature for classifying breast tumors. The proposed CAD system exploits the correlation between adjacent pixels within images as features to classify breast tumors. We used three textural features, namely the block difference of inverse probabilities (BDIP), the block variation of local correlation coefficients (BVLC) [10] and the auto-covariance matrix, to identify breast tumors.
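The block difference of inverse probabilities, the first texture feature, is defined as the difference between the number of pixels in a block and the ratio of the sum of the pixel intensities in the block to the maximum intensity in the block [10]. That is,
\[ \mathrm{BDIP} = P^{2} - \frac{\sum_{(i,j) \in B} I(i,j)}{\max_{(i,j) \in B} I(i,j)} \]
where \( I(i,j) \) denotes the intensity of pixel (i, j) and B denotes a block of size P × P. The larger the variation of intensities in a block, the higher the value of BDIP. BVLC, the second texture feature, is known to measure texture smoothness well. The value of BVLC is defined as follows: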
\[ \rho(k,l) = \frac{\frac{1}{P^{2}} \sum_{(i,j) \in B} I(i,j)\, I(i+k, j+l) - \mu_{0,0}\, \mu_{k,l}}{\sigma_{0,0}\, \sigma_{k,l}} \]
and
\[ \mathrm{BVLC} = \max_{(k,l) \in O_{4}} \rho(k,l) - \min_{(k,l) \in O_{4}} \rho(k,l) \]
where \( I(i,j) \) is the intensity of pixel (i, j), and \( \mu_{0,0} \) and \( \sigma_{0,0} \) represent the local mean value and standard deviation of the block B of size P × P. The shift (k, l) ranges over the set \( O_{4} \) of four orientations (−90°, 0°, −45°, 45°), so \( \mu_{k,l} \) and \( \sigma_{k,l} \) represent the mean value and standard deviation of the shifted block, respectively. A larger BVLC value indicates that the texture in the block is rough. In this study, P is chosen to be 2. The third texture feature, the auto-covariance matrix, reflects the inter-pixel correlation within an image. The modified auto-covariance coefficient between pixel (i, j) and pixel (i + Δm, j + Δn) in an image of size M × N is defined as
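\[ \gamma(\Delta m, \Delta n) = \frac{A(\Delta m, \Delta n)}{A(0, 0)} \]
where
\[ A(\Delta m, \Delta n) = \frac{1}{(M - \Delta m)(N - \Delta n)} \sum_{x=0}^{M-1-\Delta m} \sum_{y=0}^{N-1-\Delta n} \left| \left( f(x,y) - \mu \right) \left( f(x+\Delta m, y+\Delta n) - \mu \right) \right| \]
and μ is the mean value of f(x, y). The size of the auto-covariance matrix is Δm × Δn. These texture features form the feature vector of each tumor ROI sub-image used to diagnose breast cancer.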
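The two block-based features can be illustrated with a short sketch. It follows the usual formulation of BDIP and BVLC from [10]: BDIP is the pixel count of the block minus the ratio of the intensity sum to the intensity maximum, and BVLC is the spread (maximum minus minimum) of the local correlation coefficients over four shifts. The concrete shift offsets in `SHIFTS`, the handling of flat blocks and all function names are assumptions made for illustration, not details taken from the study.

```python
import math

# Assumed (k, l) offsets for the four orientations; the text gives only the angles.
SHIFTS = [(0, 1), (1, 0), (1, 1), (1, -1)]


def bdip(block):
    """BDIP of a P x P block: P^2 - sum(I) / max(I)."""
    pixels = [v for row in block for v in row]
    peak = max(pixels)
    if peak == 0:
        return 0.0  # all-zero block treated as perfectly flat (assumption)
    return len(pixels) - sum(pixels) / peak


def _mean_std(values):
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(var)


def local_corr(img, r, c, p, k, l):
    """Correlation between the block at (r, c) and the block shifted by (k, l)."""
    base = [img[r + i][c + j] for i in range(p) for j in range(p)]
    moved = [img[r + i + k][c + j + l] for i in range(p) for j in range(p)]
    mu0, sd0 = _mean_std(base)
    mu1, sd1 = _mean_std(moved)
    if sd0 == 0 or sd1 == 0:
        return 0.0  # undefined for a flat block; 0 is an assumption
    cross = sum(a * b for a, b in zip(base, moved)) / len(base)
    return (cross - mu0 * mu1) / (sd0 * sd1)


def bvlc(img, r, c, p=2):
    """BVLC of the P x P block whose top-left pixel is (r, c)."""
    rows, cols = len(img), len(img[0])
    # The shifts reach one pixel right, down and left of the block.
    if r < 0 or c < 1 or r + p >= rows or c + p >= cols:
        raise ValueError("block and its shifted copies must lie inside the image")
    rhos = [local_corr(img, r, c, p, k, l) for k, l in SHIFTS]
    return max(rhos) - min(rhos)
```

With P = 2 as in this study, a uniform block gives BDIP = 0, while a checkerboard pattern drives BVLC to its maximum of 2, which matches the reading of BVLC as a roughness measure.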
2.3 Support vector machine classification
The aim of the SVM is to devise a computationally efficient way of learning separating hyperplanes in a high-dimensional feature space [11]. SVMs have been shown to be an efficient method for many real-world problems because of their high generalization performance without the need for a priori knowledge. Thus, SVMs have attracted much attention as a successful tool for classification [12, 13], image recognition [14, 15] and bioinformatics [16, 17]. The SVM model maps the input vectors into a high-dimensional feature space through a non-linear mapping chosen a priori. In this space, an optimal separating hyperplane is constructed. The SVM implements the structural risk minimization principle, whose objective is to minimize the upper bound on the generalization error. Consider a set of l training vectors belonging to two separate classes, \( (x_{1}, y_{1}), (x_{2}, y_{2}), \ldots, (x_{l}, y_{l}) \), where \( x_{i} \in R^{n} \) denotes the ith input vector and \( y_{i} \in \{ + 1, - 1\} \) is the corresponding desired output. The maximal margin classifier aims to find a hyperplane \( w \cdot x + b = 0 \) that separates the training data. Among the possible hyperplanes, only one maximizes the margin, i.e., the distance between the hyperplane and the nearest data point of each class. Figure 2 shows the optimal separating hyperplane with the largest margin. The support vectors are the points lying on the margin border. The solution to the classification problem is given by the decision function
\[ f(x) = \operatorname{sign}\left( \sum_{i=1}^{N_{SV}} \alpha_{i}\, y_{i}\, k(s_{i}, x) + b \right) \]
where \( \alpha_{i} \) is the positive Lagrange multiplier, \( s_{i} \) is the ith support vector (\( N_{SV} \) in total), \( y_{i} \) is its class label and \( k(s_{i}, x) \) is the kernel function. The radial kernels performed best in our experimental comparison and hence are chosen for the proposed diagnosis system. The radial kernels are defined as
\[ k(s, x) = \exp\left( -\gamma \left\| s - x \right\|^{2} \right) \]
where \( \gamma \in R \) is a non-zero parameter.
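As a minimal sketch of how such a decision function is evaluated, the code below sums the kernel responses of the support vectors, weighted by their multipliers and labels, and takes the sign. It assumes the common radial basis form exp(−γ‖s − x‖²) for the kernel. The function names are hypothetical, and the support vectors, multipliers and bias in any call are supplied by the caller; nothing here reproduces the trained classifier of this study.

```python
import math


def rbf_kernel(s, x, gamma):
    """Radial kernel exp(-gamma * ||s - x||^2) (assumed form)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(s, x))
    return math.exp(-gamma * sq_dist)


def decision_value(x, support_vectors, labels, alphas, bias, gamma):
    """Weighted sum of kernel responses plus bias, before taking the sign."""
    total = sum(
        alpha * y * rbf_kernel(s, x, gamma)
        for alpha, y, s in zip(alphas, labels, support_vectors)
    )
    return total + bias


def classify(x, support_vectors, labels, alphas, bias, gamma):
    """Return +1 or -1 according to the sign of the decision value."""
    value = decision_value(x, support_vectors, labels, alphas, bias, gamma)
    return 1 if value >= 0 else -1
```

For example, with two hypothetical support vectors (0, 0) labeled −1 and (2, 2) labeled +1, equal multipliers and zero bias, a query point near (2, 2) receives the label +1 because the kernel response of the nearer support vector dominates the sum.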
In this study, the modified normalized auto-covariance matrix and the first and second moments of BDIP and BVLC are used as the feature vector input to the SVM classifier. Both Δm and Δn are 5. Because the value of γ(0, 0) is always 1, this element is excluded; the remaining 24 auto-covariance coefficients, together with the four BDIP and BVLC moments, form a 28-D image feature vector. The 28-D textural feature vector is used as the input signal of the SVM classifier. Note that the target output value of the SVM is either −1 or 1. When the output value for an ROI sub-image of a suspicious tumor region is close to 1, the CAD system classifies the tumor in the ultrasound image as malignant. Conversely, when the output value is close to −1, the tumor is diagnosed as benign.
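The dimension count can be checked with a small sketch: a 5 × 5 normalized auto-covariance matrix contributes 24 values once γ(0, 0) is dropped, and the first and second moments of BDIP and BVLC add four more. The code assumes the mean-removed, absolute-value form of the modified auto-covariance, which is one reading of "modified" and not confirmed by the text. The function names and the `texture_moments` argument are hypothetical.

```python
def normalized_autocov(img, size=5):
    """Return a size x size matrix of normalized auto-covariance coefficients."""
    rows, cols = len(img), len(img[0])
    mean = sum(sum(row) for row in img) / (rows * cols)

    def coeff(dm, dn):
        # Average of |(f(x, y) - mean) * (f(x + dm, y + dn) - mean)| (assumed form).
        total = 0.0
        for x in range(rows - dm):
            for y in range(cols - dn):
                total += abs((img[x][y] - mean) * (img[x + dm][y + dn] - mean))
        return total / ((rows - dm) * (cols - dn))

    a00 = coeff(0, 0)
    if a00 == 0:
        raise ValueError("a constant image has no texture to normalize")
    return [[coeff(dm, dn) / a00 for dn in range(size)] for dm in range(size)]


def feature_vector(img, texture_moments, size=5):
    """Concatenate the auto-covariance coefficients (without the (0, 0)
    element) with the four BDIP/BVLC moments supplied by the caller."""
    gamma = normalized_autocov(img, size)
    coeffs = [
        gamma[dm][dn]
        for dm in range(size)
        for dn in range(size)
        if (dm, dn) != (0, 0)
    ]
    return coeffs + list(texture_moments)
```

Passing four moment values yields a vector of length 24 + 4 = 28, which agrees with the 28-D feature vector described above.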
3 Simulations and results
Receiver operating characteristic (ROC) analysis is the most common means of measuring diagnostic accuracy. The area under the ROC curve, \( A_{Z} \), was calculated from the ROC curves with the software package LABROC1 (Prof. C.E. Metz, University of Chicago, IL, USA). \( A_{Z} \) is a quantitative index of the overall performance of a diagnosis [18], as shown in Fig. 3, and therefore allows different methods to be compared in how clearly they distinguish positive from negative findings of breast tumors. The overall performance of each diagnostic system is evaluated by examining \( A_{Z} \) over the testing output values. The k-fold cross-validation method [19] is used to estimate the performance of the proposed SVM system and the MLP system designed by Chen et al. [7] (denoted Chen’s CAD). The cases are divided into k groups. The first group is set aside, and the CAD systems are trained on the remaining (k − 1) groups. Once trained, the CAD systems are tested on the group that was set aside. The second group is then set aside, and the CAD systems are trained on the remaining (k − 1) groups and tested on the excluded group. This process is repeated until each of the k groups has been used in turn as the test group. In the simulations, k is 10 and each group has 25 ultrasound images. Because the radial kernels performed best in the experiments, they are used in the proposed SVM diagnosis system. Figure 4 shows the diagnostic performance of the SVM system for different γ values. With γ ranging from 0.01 to 0.02, the proposed CAD system obtains stable and high accuracy.
Figures 5 and 6 show the ROC curves of Chen’s CAD and the proposed SVM system in the classification of malignant and benign tumors. The proposed SVM system and Chen’s CAD achieve \( A_{Z} \) = 0.9561 and 0.9496, respectively. Table 1 lists the number of misdiagnosed cases of Chen’s CAD (threshold = 0.2) and the proposed SVM system for each test set. Table 2 lists the accuracy, sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) of Chen’s CAD and the proposed SVM system. We also compare the computation time of the CAD systems. The simulations were run on a personal computer with a single Intel Pentium-4® 2.4 GHz CPU and the Microsoft Windows XP® operating system. Table 3 shows the training time for the ultrasound image database and the average diagnosis time for each breast tumor. The training time and average diagnosis time of Chen’s CAD are 778 and 1,994 times longer than those of the proposed CAD, respectively. In all the simulations, the proposed CAD system obtains better classification performance and faster computation than Chen’s CAD.
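The cross-validation loop can be sketched as an index generator that sets each group aside once. The shuffle with a fixed seed is an assumption, since the text does not say how cases were assigned to groups, and `k_fold_splits` is a hypothetical helper name.

```python
import random


def k_fold_splits(n_cases, k, seed=0):
    """Yield (train_indices, test_indices) once for each of the k groups."""
    order = list(range(n_cases))
    random.Random(seed).shuffle(order)  # random assignment is an assumption
    folds = [order[i::k] for i in range(k)]  # k groups of near-equal size
    for held_out in range(k):
        test = folds[held_out]
        train = [
            idx
            for f, fold in enumerate(folds)
            if f != held_out
            for idx in fold
        ]
        yield train, test


if __name__ == "__main__":
    # 250 cases and k = 10, as in the simulations described above.
    for fold_number, (train, test) in enumerate(k_fold_splits(250, 10), start=1):
        print(fold_number, len(train), len(test))
```

With 250 cases and k = 10, every test group holds 25 cases and every training set holds the other 225, so each image is used for testing exactly once.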
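The reported measures follow from the confusion counts, with malignant taken as the positive class (+1) and benign as the negative class (−1). The sketch below also computes an empirical area under the ROC curve from pairwise score comparisons. This is a simpler, non-parametric stand-in and not the curve-fitting procedure of LABROC1, so its values need not match the reported \( A_{Z} \). The function names are hypothetical.

```python
def diagnostic_metrics(truth, predicted):
    """Accuracy, sensitivity, specificity, PPV and NPV for labels in {+1, -1}.

    Assumes every denominator is non-zero, i.e. both classes occur in
    `truth` and both labels occur in `predicted`.
    """
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == -1 and p == -1)
    fp = sum(1 for t, p in pairs if t == -1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == -1)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def empirical_auc(truth, scores):
    """Fraction of (malignant, benign) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(truth, scores) if t == 1]
    neg = [s for t, s in zip(truth, scores) if t == -1]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

Sensitivity and specificity depend on the decision threshold applied to the classifier output, whereas the area under the curve summarizes performance over all thresholds, which is why both kinds of measure are reported.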
4 Conclusions
Ultrasound has become one of the major imaging modalities for the diagnosis of breast lesions. Improved imaging techniques allow the management of detected breast lesions to become less invasive. This study proposes an efficient CAD system using the SVM model to differentiate between benign and malignant tumors. CAD systems have been developed for the diagnosis of many human diseases from medical images. A well-designed CAD system can assist radiologists and physicians in detecting lesions and in differentiating benign from malignant lesions on the basis of medical images. The results produced by CAD can be used as a “second opinion” to assist radiologists in their interpretations, to improve diagnostic accuracy and to avoid needless biopsies. The proposed CAD system diagnoses breast tumors using texture features within the ultrasound image. The MLP classifier in Chen’s CAD is replaced by the SVM model to obtain a better result in distinguishing between benign and malignant cases. The MLP system is time consuming to train and depends on its initial conditions. The proposed SVM approach identifies solid breast nodules with comparatively high accuracy. Given the highly satisfactory specificity and sensitivity of the results, the proposed CAD is expected to be a helpful tool for classifying benign and malignant tumors in ultrasound images. As the database expands, new cases can easily be added to the training set and used as references. Experimental results demonstrate the feasibility and excellent performance of the proposed CAD system in ultrasound image classification.
References
[1] American Cancer Society (2003) Breast cancer facts and figures 2001–2002. American Cancer Society, Atlanta, Georgia
[2] Stavros AT, Thickman D, Rapp CL, Dennis MA, Parker SH, Sisney GA (1995) Solid breast nodules: use of sonography to distinguish between benign and malignant lesions. Radiology 196(1):123–134
[3] Bassett LW, Liu TH, Giuliano AE, Gold RH (1991) The prevalence of carcinoma in palpable vs impalpable, mammographically detected lesions. AJR Am J Roentgenol 157(1):21–24
[4] Chen DR, Chang RF, Huang YL (2000) Breast cancer diagnosis using self-organizing map for sonography. Ultrasound Med Biol 26(3):405–411
[5] Chen DR, Chang RF, Huang YL, Chou YH, Tiu CM, Tsai PP (2000) Texture analysis of breast tumors on sonograms. Semin Ultrasound CT MR 21(4):308–316
[6] Chen DR, Chang RF, Kuo WJ, Chen MC, Huang YL (2002) Diagnosis of breast tumors with sonographic texture analysis using wavelet transform and neural networks. Ultrasound Med Biol 28(10):1301–1310
[7] Chen DR, Chang RF, Huang YL (1999) Computer-aided diagnosis applied to US of solid breast nodules by using neural networks. Radiology 213(2):407–412
[8] Garra BS, Krasner BH, Horii SC, Ascher S, Mun SK, Zeman RK (1993) Improving the distinction between benign and malignant breast lesions: the value of sonographic texture analysis. Ultrason Imaging 15(4):267–285
[9] Haykin S (1999) Multilayer perceptrons. In: Marcia Horton (ed) Neural networks: a comprehensive foundation, 2nd edn. Prentice-Hall, New Jersey, pp 156–255
[10] Yang DC, Sang YS (2003) Image retrieval using BDIP and BVLC moments. IEEE Trans Circuits Syst Video Technol 13(9):951–957
[11] Cristianini N, Shawe-Taylor J (2000) An introduction to support vector machines and other kernel-based learning methods. Cambridge University Press, UK
[12] Kim KI, Jung K, Park SH, Kim HJ (2002) Support vector machines for texture classification. IEEE Trans Pattern Anal Mach Intell 24(11):1542–1550
[13] Song Q, Hu WJ, Xie WF (2002) Robust support vector machine with bullet hole image classification. IEEE Trans Syst Man Cybern C Appl Rev 32(4):440–448
[14] El Naqa I, Yang YY, Wernick MN, Galatsanos NP, Nishikawa RM (2002) A support vector machine approach for detection of microcalcifications. IEEE Trans Med Imaging 21(12):1552–1563
[15] Yang MH, Roth D, Ahuja N (2002) A tale of two classifiers: SNoW vs. SVM in visual recognition. Comput Vis ECCV 2353(Pt IV):685–699
[16] Sun YF, Fan XD, Li YD (2003) Identifying splicing sites in eukaryotic RNA: support vector machine approach. Comput Biol Med 33(1):17–29
[17] Song MH, Breneman CM, Bi JB, Sukumar N, Bennett KP, Cramer S et al (2002) Prediction of protein retention times in anion-exchange chromatography systems using support vector regression. J Chem Inf Comput Sci 42(6):1347–1357
[18] Hanley JA, McNeil BJ (1982) The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 143(1):29–36
[19] Weiss SM, Kapouleas I (1989) An empirical comparison of pattern recognition, neural nets, and machine learning classification methods. Proceedings of the 11th international joint conference on artificial intelligence. Morgan Kaufmann, Detroit, pp 234–237
Acknowledgment
This work was supported by the National Science Council, Taiwan, Republic of China, under Grant NSC 93-2213-E-029-014.
Cite this article
Huang, YL., Wang, KL. & Chen, DR. Diagnosis of breast tumors with ultrasonic texture analysis using support vector machines. Neural Comput & Applic 15, 164–169 (2006). https://doi.org/10.1007/s00521-005-0019-5