Introduction

Angle closure glaucoma (ACG) is a prevalent eye disease in Asia and a major cause of blindness [1]. There are four main mechanisms underlying ACG: pupil block (PB), plateau iris configuration (PL), thick peripheral iris roll (PIR) and exaggerated lens vault (LV). Because each mechanism requires a specific treatment, classification of these four mechanisms is clinically important for providing better treatment to ACG patients [2]. Anterior segment optical coherence tomography (AS-OCT) provides high-resolution images of the anterior chamber of the eye and has been used extensively for glaucoma detection [13]. Anterior chamber (AC) and AC angle parameters provided by AS-OCT have been used to evaluate different ACG mechanisms [2, 4]. Wirawan et al. selected ten discriminative features from 84 parameters measured from segmented AS-OCT images of the patients, and used an AdaBoost classifier to classify these four mechanisms [5].

While many existing methods for glaucoma detection differentiate glaucomatous eyes from normal ones using features extracted from fundus images [6], optic nerve head stereo photographs [7], or OCT [8] in combination with various classifiers, few studies have addressed the classification of different ACG mechanisms despite its clinical importance. Our study is therefore motivated by the need for a new multiclass classification method with high accuracy in recognizing different ACG mechanisms. Traditionally, there are two categories of approaches to multiclass classification problems. One is to develop a single model for all the classes. Classifiers such as k-Nearest Neighbors (kNN), decision tree, Naive Bayes, and linear discriminant analysis (LDA) naturally handle multiclass classification.

In contrast, some more advanced classifiers, such as the support vector machine (SVM) and AdaBoost, are binary classifiers. Hence, a set of binary classifiers (dichotomizers) must be designed and combined to deal with multiclass classification tasks. On the other hand, ensemble learning by combining multiple dichotomizers is more advantageous than a traditional single classifier in that it has superior generalization ability and is suitable for learning nonlinear classification boundaries. Intuitively, to differentiate c classes, c dichotomizers are trained, each of which discriminates one specific class from all the remaining classes. For a test sample, the class whose dichotomizer gives the largest probability output is taken as the predicted label. This is the well-known one-versus-all (OVA) strategy [9, 10]. An alternative is one-versus-one (OVO), where c(c−1)/2 dichotomizers are trained to separate each pair of classes.

To solve multiclass problems using binary classifiers, Dietterich et al. provided a unified framework based on error-correcting output codes (ECOC), which was further improved in [11, 12]. This framework is well known for its nonlinear classification capability. A code matrix with n columns is designed to decompose the multiclass problem into n binary ones; the outputs of these n dichotomizers are then combined to determine the class label of the test sample [13]. OVA and OVO can be seen as special cases of the ECOC framework. Much research has been devoted to improving the classification performance of ECOC [14-18], especially using data-driven approaches [19-21]. By observing the data distribution in the original feature space, dichotomizers are trained for easily separated pairs of class clusters. However, this observation of the data distribution relies on a common feature space.

In this study, we propose a wrapper approach to learn the ECOC code matrix, in which the best feature set for each dichotomizer and the best combination of dichotomizers for ECOC are both selected via cross-validation on the training dataset. This method improves on other ECOC-based methods in two aspects: 1) each dichotomizer has its own optimal feature set; 2) a new criterion is proposed in which the best combination of dichotomizers is selected in consideration not only of the separability of the codewords in the ECOC framework but also of the classification ability of the selected dichotomizers.

This paper is organized as follows. The ECOC framework is reviewed in “ECOC framework” section, followed by the proposed method in “Method for multiclass classification” section. Experimental results of multiclass classification on the glaucoma dataset are shown in “Experimental results on classification of different glaucoma mechanisms” section. Conclusions are drawn in the final section.

ECOC framework

There are two major processes in the ECOC framework: coding and decoding. The key of the coding process lies in the design of a code matrix M ∈ {−1, 0, 1}^{c×n}, with c rows and n columns, where c and n denote the numbers of classes and dichotomizers, respectively. The i-th row of M provides the codeword C_i for the i-th class (i = 1, 2, …, c). Meanwhile, each column of M represents the partition of the classes by one dichotomizer. Classes coded by 1 and −1 are treated as positives and negatives, respectively, while those coded by 0 are omitted in training the dichotomizer. For a four-class classification problem, the OVA and OVO strategies are represented by the two code matrices shown in Table 1a and b, respectively.

Table 1 The ECOC code matrix in (a) OVA strategy and (b) OVO strategy
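As an illustration, the two code matrices of Table 1 can be generated programmatically. The following Python sketch is ours, not part of the original study; the function names are placeholders.

```python
# Illustrative sketch (not from the paper): building the OVA and OVO
# code matrices M in {-1, 0, 1}^(c x n) of Table 1 for c = 4 classes.
import numpy as np

def ova_matrix(c):
    # One column per class: that class is coded +1, all others -1.
    return (2 * np.eye(c) - 1).astype(int)

def ovo_matrix(c):
    # One column per class pair (i, k): class i is +1, class k is -1,
    # and the remaining classes are 0 (omitted when training).
    cols = []
    for i in range(c):
        for k in range(i + 1, c):
            col = np.zeros(c, dtype=int)
            col[i], col[k] = 1, -1
            cols.append(col)
    return np.column_stack(cols)

M_ova = ova_matrix(4)  # 4 classes x 4 dichotomizers
M_ovo = ovo_matrix(4)  # 4 classes x 6 dichotomizers, c(c-1)/2 = 6
```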

In the decoding process, the outputs of the n trained dichotomizers for a test sample are collected into a vector V = {v_1, v_2, …, v_n} and compared with the codeword of each class; the nearest codeword determines the class label of the test sample. There are many decoding strategies for evaluating the distance between the vector V and each codeword C_i [13]. For example, in loss-based decoding, the distance is formulated as

$$ d_{\mathrm{H}}\left(V, C_i\right) = \frac{1}{2}\sum_{j=1}^{n} L\left(V(j)\cdot C_i(j)\right) $$
(1)

where L(·) denotes the loss function, which depends on the type of dichotomizer.
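A minimal Python sketch of this decoding rule follows; the exponential loss L(z) = e^{−z} is the AdaBoost loss used later in the experiments, and the function and variable names are ours.

```python
# Sketch of loss-based decoding (Eq. 1) with the exponential loss.
import numpy as np

def loss_based_decode(V, M, loss=lambda z: np.exp(-z)):
    # V: length-n vector of dichotomizer outputs for one test sample.
    # M: c x n code matrix; row i is the codeword C_i.
    # The predicted class is the row with the smallest accumulated loss.
    distances = 0.5 * loss(V[np.newaxis, :] * M).sum(axis=1)
    return int(np.argmin(distances))

# Toy example with a 4-class OVA matrix: outputs strongly favour class 2.
M = 2 * np.eye(4) - 1
V = np.array([-0.9, -0.8, 1.0, -0.7])
predicted = loss_based_decode(V, M)  # 2
```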

Most research focuses on the coding process, i.e., designing an optimal code matrix. Allwein et al. proposed a dense random code matrix obtained by maximizing the distances between the codewords of different classes, which was further extended to a sparse random code matrix [21]. Pujol et al. proposed discriminant ECOC (DECOC), which learns the code matrix from a hierarchical partition of the classes and uses (c−1) dichotomizers [17]. In most existing works, the feature space is fixed to facilitate learning the data distribution and designing a problem-dependent code matrix. However, such a common feature space is at best optimal for all the dichotomizers as a whole, so the individual dichotomizers may not be optimized specifically, and the classification performance may not be the best even though the code matrix is well designed in terms of class separability. In this study, all the dichotomizers are first optimized with their own specific feature sets, and then the ECOC code matrix is learned by selecting the best combination of dichotomizers. The details are given in the next section.

Method for multiclass classification

In the proposed method, feature selection for each dichotomizer and dichotomizer selection in the ECOC code matrix are performed in tandem to learn the code matrix, as shown in Fig. 1. The proposed method is detailed as follows.

Fig. 1
figure 1

Block diagram of the proposed method

Step 1: Feature set optimization for each dichotomizer

Based on combinatorial analysis, the total number N of different dichotomizers is given by Eq. (2),

$$ N = \frac{1}{2}\left(3^{c} - 2^{c+1} + 1\right) $$
(2)

where c is the number of classes. For example, there are 25 possible dichotomizers for a four-class classification problem such as ACG diagnosis. Using a state-of-the-art feature selection method such as minimum redundancy maximum relevance (mRMR) [22-24], the best feature set, which is closely related to the target class with minimum inter-feature redundancy, is identified for each dichotomizer. Wirawan et al. have shown that mRMR is suitable and effective for selecting informative and discriminative features for ACG classification [5]. To find the optimal feature set, a filter-wrapper approach is used [22, 23]: features are first ranked according to the mRMR criterion and the highly ranked features are retained; then sequential forward selection (SFS) (or sequential backward selection (SBS), or a floating search method) is performed to select the best feature set for each dichotomizer using cross-validation on the training dataset.
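Eq. (2) can be checked by brute force. The sketch below, written for this discussion, enumerates every valid class partition over c classes.

```python
# Sketch verifying Eq. (2): each dichotomizer assigns every class a code
# in {-1, 0, 1}; a valid column must contain at least one +1 and one -1,
# and a column and its negation define the same dichotomizer, which
# accounts for the factor 1/2.
from itertools import product

def all_dichotomizers(c):
    cols = set()
    for col in product((-1, 0, 1), repeat=c):
        if 1 in col and -1 in col:
            if tuple(-v for v in col) not in cols:
                cols.add(col)
    return sorted(cols)

n_dichotomizers = len(all_dichotomizers(4))  # 25 for the four-class ACG problem
assert n_dichotomizers == (3**4 - 2**5 + 1) // 2
```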

However, since the number of dichotomizers N grows exponentially with the number of classes c (Eq. (2)), a wrapper approach to selecting the best feature set for each dichotomizer may be time-consuming. For simplicity, a filter approach is preferred for fast feature selection when N is large. Feature selection not only improves the classification performance (i.e., accuracy) of each dichotomizer, but also decreases the dependency among the dichotomizers by selecting a different optimal feature set for each, which benefits the error-correcting ability of ECOC.
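The filter-wrapper step above can be sketched as follows. This is a simplified greedy forward pass, not the paper's exact procedure: the mRMR ranking and the cross-validated scorer are passed in as callables (both depend on the dichotomizer being trained), and all names are ours.

```python
# Minimal sketch of the filter-wrapper feature selection step.
def sequential_forward_selection(ranked_features, cv_score, max_features=None):
    # ranked_features: feature indices ordered by the mRMR filter.
    # cv_score: maps a candidate feature subset to cross-validated accuracy.
    selected, best_score = [], float("-inf")
    for f in ranked_features[:max_features]:
        score = cv_score(selected + [f])
        if score > best_score:           # keep the feature only if it helps
            selected, best_score = selected + [f], score
    return selected, best_score

# Toy scorer: features 0 and 2 are informative, the rest only add cost.
toy_score = lambda s: len(set(s) & {0, 2}) - 0.01 * len(s)
best_set, best_acc = sequential_forward_selection([0, 2, 1, 3], toy_score)
# best_set == [0, 2]
```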

Step 2: Maximization of the separability of codewords in ECOC framework in consideration of the performance of each dichotomizer

To improve the classification performance of ECOC framework, the separability of ECOC codes is maximized, which is defined as

$$ d_{s} = \min_{\substack{1 \le i \le c \\ i < k \le c}} \left( \frac{1}{2}\sum_{j=1}^{n} L\left(C_i(j)\cdot C_k(j)\right) \right) $$
(3)

where C_i and C_k denote the codewords of the i-th and k-th classes, respectively [12]. In [21], the separability is modified as

$$ d_{s}^{\prime} = \min_{\substack{1 \le i \le c \\ i < k \le c}} \left( \frac{1}{2}\sum_{j=1}^{n} \left|C_i(j)\cdot C_k(j)\right| \, L\left(C_i(j)\cdot C_k(j)\right) \right) $$
(4)

to ignore the contributions from positions coded with 0. This definition is more reasonable. However, all existing ECOC methods consider only the code information of the dichotomizers, so dichotomizers with unsatisfactory binary classification ability may also be selected into the ECOC framework, which may deteriorate the final classification performance. In the proposed method, the separability is reformulated as

$$ d_{s}^{\prime\prime} = d_{s}^{\prime} + \lambda\bar{a} = \min_{\substack{1 \le i \le c \\ i < k \le c}} \left( \frac{1}{2}\sum_{j=1}^{n} \left|C_i(j)\cdot C_k(j)\right| \, L\left(C_i(j)\cdot C_k(j)\right) \right) + \lambda\bar{a} $$
(5)

where ā is the average binary classification accuracy of the selected dichotomizers, and λ is a coefficient weighting the relative importance of ā compared with d_s′. There are two key parameters to determine: the weighting coefficient λ and the code length n. Cross-validation is applied on the training dataset to find the optimal parameters λ* and n*, and also the optimal set of dichotomizers. Finally, the trained and selected dichotomizers are used as base learners in the ECOC framework. The algorithm of the proposed method is shown in Table 2.
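The criterion of Eq. (5) can be sketched directly. In the snippet below, written for this discussion, the exponential loss is the AdaBoost loss used in the experiments, and the accuracy values are made up for illustration.

```python
# Sketch of the selection criterion in Eq. (5): the minimum pairwise
# codeword distance, with zero-coded positions masked out, plus lambda
# times the average accuracy of the selected dichotomizers.
import numpy as np

def separability(M, accuracies, lam, loss=lambda z: np.exp(-z)):
    c = M.shape[0]
    d_min = float("inf")
    for i in range(c):
        for k in range(i + 1, c):
            prod = M[i] * M[k]
            # |C_i(j) * C_k(j)| is 0 wherever either codeword has a 0.
            d = 0.5 * (np.abs(prod) * loss(prod)).sum()
            d_min = min(d_min, d)
    return d_min + lam * float(np.mean(accuracies))

M = 2.0 * np.eye(4) - 1            # OVA code matrix for c = 4
acc = [0.90, 0.85, 0.88, 0.80]     # hypothetical dichotomizer accuracies
score = separability(M, acc, lam=360)
```

With λ = 0 the criterion reduces to the plain code separability d_s′; large λ lets the average dichotomizer accuracy dominate, as noted for λ = 600 in the experiments below.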

Table 2 The algorithm of the proposed method

Experimental results on classification of different glaucoma mechanisms

Data preparation and experiment results

A dataset of 152 ACG samples provided by the National University Hospital Singapore (NUHS), collected over 2 years, is used for classification of the four ACG mechanisms [5, 24, 25]. The dataset is small because of the limited number of ACG patients recruited under the Ministry of Education (MoE) AcRF Tier 1 Funding, Singapore. AS-OCT images of glaucoma patients with different ACG mechanisms are shown in Fig. 2. PIR is characterized by a thick and folded iris, while PB is characterized by a convex forward iris profile. Eyes with the PL and LV mechanisms have the largest and smallest AC volumes, respectively [1]. Customized software (Anterior Segment Analysis Program, ASAP, National University Hospital, Singapore) was used to measure anterior chamber (AC) characteristics. The ASAP software uses the level set method to segment the AC area [1, 5] of the AS-OCT image, as shown in Fig. 3. The quantifiable AC parameters measured by the ASAP software include anterior chamber depth (ACD), anterior chamber volume (ACV), anterior chamber width (ACW), angle recess area (ARA), angle opening distance (AOD), post closure area (PCA), trabecular-iris space area (TISA), lens vault (LV) distance, iris area (IA), iris thickness (IT), iris concavity, etc., as illustrated in Fig. 4. The samples were labeled by medical experts at NUHS (C. C. Sng, M. C. Aquino and P. T. K. Chew), and basic information about the glaucoma dataset used in this study is given in Table 3.

Fig. 2
figure 2

Illustrative samples of AS-OCT images of glaucoma patients with different ACG mechanisms a PIR; b LV; c PB; d PL.

Fig. 3
figure 3

An example of AS-OCT image a and its corresponding segmentation result in b by using ASAP software

Fig. 4
figure 4

The parameters measured from the AS-OCT image of the AC segment of the eyes (not all of the 84 parameters are shown here). For example, ACDL1500 means anterior chamber depth of the left hand side measured at 1500 μm from the scleral spur

Table 3 The basic information of the Glaucoma dataset used in this study

Since each mechanism has several characteristics from a medical point of view, 84 features are extracted, all of which are clinically important parameters measured from the segmented AS-OCT image. Some of the important features, identified in our previous studies using the same dataset [5, 24, 25], are as follows: AC_Area (anterior chamber area); AC_Volume (anterior chamber volume); ACD (anterior chamber depth); ACW (anterior chamber width); Anterior_lens_curvature (curvature of the anterior lens surface); ILC_L (iridolenticular contact on the left side); ILC_R (iridolenticular contact on the right side); Iris_area_IL (iris area in iridolenticular contact); Iris_area_L500 (with the scleral spur as the centre of a circle of radius 500 μm, the area of the iris region inside this circle on the left side); Iris_Chord_Length_L (the distance from the tip of the iris to the periphery on the left side); Iris_Chord_Length_R (the distance from the tip of the iris to the periphery on the right side); Iris_end_concavity_L (concavity of the iris area at the end on the left side); Iris_thickness_L1000 (with the scleral spur (SS) as the centre of a circle of radius 1000 μm, the intersection point on the anterior surface of the iris is identified; the iris thickness is the shortest distance from this intersection point to the posterior surface of the iris, on the left side); Iris_thickness_L_DMR (thickness of the iris in the dilator muscle region (DMR) on the left side); Iris_thickness_L_Max (maximum iris thickness); Iris_thickness_L_SMR (iris thickness in the sphincter muscle region on the left side); Iris_thickness_PL (iris thickness at the plateau contact); Iris_thickness_R_DMR (thickness of the iris in the dilator muscle region on the right side); Iris_thickness_R_SMR (iris thickness in the sphincter muscle region on the right side); Lens vault (the perpendicular distance between the horizontal line joining the two scleral spurs and the anterior pole of the crystalline lens, representing the anterior portion of the lens); Pupil_distance (distance between the centers of the pupils).

The experimental investigation of the proposed method was implemented in Matlab 8.0 R2012b (The MathWorks Inc., Natick, MA, USA) and Microsoft Visual Studio (C++). All 84 features are normalized to zero mean and unit variance. Wirawan et al. have shown that AdaBoost performs better than SVM for ACG classification, and that AdaBoost combined with the OVA strategy also outperforms traditional multiclass classifiers, such as the classification tree and Naive Bayes, in terms of classification accuracy [5]. The WEKA (Waikato Environment for Knowledge Analysis) data mining tool [26] was used to compare the proposed method with the traditional multiclass classifiers, using the same default parameters as reported in [5], which used the same dataset. Thus AdaBoost is the binary classifier of choice in this experiment. In addition, for a fair comparison with the results in [5], we also used mRMR for feature selection. In this four-class classification problem, there are in total 25 possible dichotomizers, which are easily obtained by exhaustive search. In the first step of the proposed method, all 84 features are ranked according to the mRMR criterion for each dichotomizer. Each feature is incrementally added to a ranking list according to the following equations,

$$ \max_{f_j \in F - F_{m-1}} \left[ I\left(f_j, y\right) - \frac{1}{m-1} \sum_{f_i \in F_{m-1}} I\left(f_j, f_i\right) \right] $$
(6)

or

$$ \max_{f_j \in F - F_{m-1}} \left[ I\left(f_j, y\right) \Big/ \left( \frac{1}{m-1} \sum_{f_i \in F_{m-1}} I\left(f_j, f_i\right) \right) \right] $$
(7)

where F is the whole feature set and F_{m-1} is the set of (m−1) features already selected; the criterion maximizes the mutual information I(f_j, y) between the j-th feature and the class label y while minimizing the mutual information I(f_j, f_i) between the j-th feature and each feature f_i already in F_{m-1} [23]. To further increase the classification ability of each dichotomizer, a wrapper approach is used to select the feature set that yields the lowest classification error. The dichotomizers are ranked in descending order of classification accuracy, as shown in Table 4. In Step 2, for λ_p ∈ [0, 600] with a step of 30 and n_q ∈ [1, 9] with a step of 1 (p = 0, 1, 2, …, 20; q = 1, 2, …, 9), the best set of dichotomizers B_{p,q} is determined by maximizing d_s″ in Eq. (5), where L(z) = e^{−z}. The separability d_s″ is dominated by the average classification accuracy of the selected dichotomizers when the weighting coefficient λ = 600. Following the suggestion of [17, 20], the code length n should be about 15 log(c) ≈ 9; here n varies from 1 to 9.
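The incremental mRMR ranking of Eq. (6) can be sketched as below. The mutual information values are supplied as precomputed arrays (relevance[j] = I(f_j, y), redundancy[j, i] = I(f_j, f_i)); estimating mutual information itself is outside the scope of this sketch, and the toy values are made up.

```python
# Sketch of the incremental mRMR ranking (difference form, Eq. 6).
import numpy as np

def mrmr_rank(relevance, redundancy, n_select):
    selected = [int(np.argmax(relevance))]       # most relevant feature first
    while len(selected) < n_select:
        best_j, best_score = None, float("-inf")
        for j in range(len(relevance)):
            if j in selected:
                continue
            # Relevance minus mean redundancy to already-selected features.
            score = relevance[j] - redundancy[j, selected].mean()
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
    return selected

rel = np.array([0.90, 0.80, 0.85, 0.10])
red = np.array([[0.0, 0.7, 0.1, 0.0],
                [0.7, 0.0, 0.6, 0.0],
                [0.1, 0.6, 0.0, 0.0],
                [0.0, 0.0, 0.0, 0.0]])
ranking = mrmr_rank(rel, red, 3)  # [0, 2, 1]
```

Note that feature 1, although more relevant than nothing, is demoted below feature 2 because of its high redundancy with feature 0, which is exactly the behaviour Eq. (6) encodes.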

Table 4 The Ranking of all the Dichotomizers according to their classification accuracy on the training dataset

The classification performance of these different sets of dichotomizers {B_{p,q}} is evaluated on the training dataset by leave-one-out cross-validation (LOOCV); given the limited data available, LOOCV is used to prevent over-fitting to the training data. The best set of dichotomizers is determined to be B_{13,6} = {D1, D3, D4, D5, D6, D7}, with optimal parameters λ* = 360 and n* = 6. The ECOC code matrix formed by the selected set of dichotomizers B_{13,6} is shown in Table 5. The loss-based decoding strategy is used in the decoding process, with the AdaBoost loss function L(z) = e^{−z}. The confusion matrix obtained using LOOCV on the glaucoma dataset is shown in Table 6. The weighted average classification accuracy is 87.65 %, as shown in Table 7, which is better than the accuracy of 84 % obtained in [5] (the dataset in [5] differs slightly from ours in that four additional patients with no glaucoma mechanism were included; this effect is negligible).

Table 5 The ECOC code matrix determined in the proposed method
Table 6 Confusion matrix obtained by using the proposed method (Leave-one-out cross-validation)
Table 7 Comparison of classification accuracy of the proposed method with other ECOC-based methods with dichotomizer-specific feature selection

Comparison with other ECOC methods, including OVO, OVA, sparse random ECOC

In most traditional methods, the dichotomizers are not optimized individually; only Wang et al. and Maghsoudi et al. used feature selection to optimize the dichotomizers in an OVA scheme [14, 15]. In this paper, we apply feature selection to each dichotomizer for all three ECOC methods based on the OVO, OVA [9, 20] and sparse random [21] strategies. The classification accuracy for each class and the weighted average accuracy of these three popular existing ECOC methods are shown in Table 7. The highest weighted average accuracy of the three ECOC methods is 85.81 %, better than that of traditional multiclass classifiers such as the classification tree (72.22 %), random forest (76.58 %), SVM combined with the OVA strategy (78.22 %) and Naive Bayes (77.93 %).

We also randomly select 80 % of the dataset for training and the remaining 20 % for testing to compare the proposed method with the other ECOC methods; this process is repeated 2000 times. The classification accuracy (mean ± standard deviation) of the proposed method and the three other ECOC methods mentioned above is 84.86 ± 3.56 %, 83.69 ± 3.75 %, 79.76 ± 3.75 %, and 81.45 ± 3.70 %, respectively. The histograms of the classification accuracy for the proposed method and the three ECOC methods are shown in Fig. 5a–d, respectively, from which we can see that the proposed method performs best. In the proposed method, all the dichotomizers are first optimized individually to increase their diversity and classification accuracy, and then the ECOC code matrix is learned by maximizing Eq. (5) and selecting a set of competitive dichotomizers. Not only the code information but also the classification ability of the dichotomizers is considered in maximizing the separability of the codewords in the ECOC matrix. In most traditional ECOC methods, by contrast, the dichotomizers are not selected in consideration of their classification performance.
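The repeated random-split protocol above can be sketched as follows. This is a minimal illustration, not the paper's pipeline: the classifier is a stand-in callable (here a majority-class baseline), and the dummy dataset merely matches the size of the glaucoma dataset, with made-up class counts.

```python
# Sketch of repeated 80/20 holdout evaluation with mean/std accuracy.
import numpy as np

def repeated_holdout(X, y, fit_predict, n_rounds=2000, test_frac=0.2, seed=0):
    rng = np.random.default_rng(seed)
    n_test = int(round(test_frac * len(y)))
    accs = []
    for _ in range(n_rounds):
        perm = rng.permutation(len(y))
        test, train = perm[:n_test], perm[n_test:]
        y_pred = fit_predict(X[train], y[train], X[test])
        accs.append(float(np.mean(y_pred == y[test])))
    return np.mean(accs), np.std(accs)

X = np.zeros((152, 84))                            # dummy feature matrix
y = np.array([0] * 75 + [1] * 35 + [2] * 25 + [3] * 17)  # made-up class counts
# Majority-class baseline standing in for the full ECOC pipeline.
majority = lambda Xtr, ytr, Xte: np.full(len(Xte), np.bincount(ytr).argmax())
mean_acc, std_acc = repeated_holdout(X, y, majority, n_rounds=50)
```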

Fig. 5
figure 5

The histograms of the classification accuracy of the a proposed method; b OVO; c sparse random; d OVA based ECOC methods, based on 2000 rounds of experiments in which 80 % of the dataset was randomly selected for training and the remaining 20 % for testing

In this experiment, three dichotomizer sets, namely {D1, D2, D5, D6, D18, D23}, {D4, D12, D16, D17}, and {D6, D7, D8, D9, D10, D11, D12, D13, D14}, are used in the three ECOC methods based on the OVO, OVA and sparse random strategies, respectively (the details of the dichotomizers are shown in Table 4). Because dichotomizers with low accuracy, such as {D18, D23, D16, D17, D13, D14}, are incorporated in the code matrices of these three methods, their final performance deteriorates accordingly. The performance of OVO is relatively better than that of the OVA and sparse random based ECOC methods, because only two relatively inaccurate dichotomizers {D18, D23} are included and the others have very high classification accuracy. In the proposed method, the selected dichotomizers {D1, D3, D4, D5, D6, D7} all have high accuracy, which ensures better performance than the other traditional ECOC methods.

Conclusions

Angle closure glaucoma is a prevalent eye disease worldwide, especially in Asia. ACG has four different mechanisms, each requiring a different clinical treatment; classification of these four mechanisms is therefore important in automatic diagnosis of glaucoma. In this paper, a new ECOC-based ensemble learning method is proposed for multiclass classification, with application to the classification of the four mechanisms of ACG. In the proposed method, for each possible dichotomizer, the best feature set is determined and the classification accuracy is obtained using cross-validation on the training glaucoma dataset. The dichotomizers are selected by maximizing both the separability of the codewords in the ECOC matrix and the classification ability of the dichotomizers, and the selected dichotomizers are included in the ECOC framework. The proposed method has been experimentally applied to a glaucoma dataset of 152 patients covering the four mechanisms, and its classification accuracy is experimentally validated to be better than that of three other existing ECOC methods.

Two points make the proposed method perform better than the others: 1) the dichotomizers are optimized individually and their binary classification abilities are quantified prior to dichotomizer selection; the classification accuracy and diversity of the dichotomizers in the ECOC framework are improved by using a different optimal feature set for each dichotomizer; 2) the ECOC code matrix is determined so that the dichotomizers are all competitive, with high binary classification performance, and the codewords are well separated, which ensures the final classification performance. The proposed method is promising for automatic classification of different ACG mechanisms and can help doctors select a specific treatment for each mechanism.