Introduction

The World Health Organization has declared glaucoma the second leading cause of blindness worldwide, accounting for 15 % of all blindness cases. This corresponds to 5.2 million people [1], and the number is expected to increase to 80 million by 2020 [2]. The disease is associated with malfunctioning of the eye's drainage system, which raises the intra-ocular pressure and damages the components of the optic nerve. Glaucoma starts with loss of peripheral vision and therefore remains unnoticed in its early stages; if left untreated, however, it can result in permanent blindness [2]. Since the damage is irreversible, a screening system is critically required to detect the disease in its early phase so that treatment of the eye's drainage structure can prevent further loss of vision. Indications of glaucoma include thinning of the neuroretinal rim (NRR) and enlargement of the optic cup, which increase the cup to disc ratio (CDR). Clinical analysis of glaucoma depends on the CDR value, and a CDR of 0.65 or above is declared glaucomatous [3]. Figure 1 shows fundus images of a normal and a glaucomatous eye with normal and abnormal optic nerves respectively.

Fig. 1
figure 1

Normal and glaucomatous eye a normal optic nerve, b abnormal optic nerve with thin NRR and large cup

A number of methods have been presented for automated detection of the optic disc and of different retinal diseases [4]–[6]. Wong et al. [7] used an intelligent fusion based approach for CDR calculation. After localizing the region of interest (ROI), a variational level set approach is applied for detection of the optic disc (OD) and histogram based intensity analysis is used for segmentation of the optic cup. They employed support vector machines (SVM) and neural networks (NN) for classification, where SVM performed slightly better than NN. A system for glaucoma detection based on higher order spectra (HOS) features and texture based predictors was proposed in [8]; SVM, Naïve Bayes and random forest classifiers were used for supervised classification. The techniques were applied on 60 images obtained from a local database and achieved an accuracy of 91 %. Zhang et al. [9] applied a variational level set method on the red component of the color image for segmentation of the optic disc and computed the vertical cup to disc ratio; their system was applied on 1564 images collected from the Singapore Eye Research Institute (SERI) and achieved an accuracy of 96 %. Sagar et al. [10] presented an automated technique for segmentation of the optic disc and cup. Local image information is generated for each point of interest in the ROI and added as a feature to cater for challenges such as image variations near or within the ROI. The cup boundary is segmented using a vessel bending indicator (r-bends strategy) at the boundary of the cup, and spline fitting is applied to crop the boundary; the system detected glaucoma with an accuracy of 90 % on 138 images. Recently, a review of glaucoma detection techniques was presented by Haleem et al. [11]. It highlighted that additional signs, such as vascular and shape based changes in the OD, can be considered along with CDR when detecting glaucoma from fundus images. The authors also concluded that although a number of signs are visible in digital fundus images, the use of OCT images remains important for reliable and timely detection of glaucoma.

There are still open issues in automated glaucoma detection, including correct localization of the OD in the presence of other pathologies and a detailed feature set that represents glaucoma more fully than CDR or RDR alone. The proposed system addresses these issues and presents a novel, robust method for OD localization which works even in the presence of noise and other retinal pathologies. The second contribution of this research is the construction of a detailed hybrid feature set which combines traditional RDR and CDR with spatial and spectral features to form a novel yet comprehensive feature set. The proposed system also contains a novel classifier capable of handling anomalies and multivariate distributions. Another major contribution is the generation of an annotated database purely for glaucoma detection which can be used by other researchers to evaluate their methods. We have made this database available on request on our group web page [12].

This article comprises four sections. The "Proposed system" section contains the proposed methodology, followed by the materials and experimental results in the "Experimental results" section. The conclusion and discussion are given in the "Discussion" section.

Proposed system

Glaucoma is a pathological condition of optic nerve damage and is the second leading cause of vision loss; it is known as the silent thief of sight. In this article, we present a computer aided diagnostic system for early and automated detection of glaucoma using a novel feature set. The proposed system consists of four main modules: preprocessing, region of interest detection based on automatic OD localization, feature extraction and finally classification. Figure 2 shows the flow diagram of the proposed system, which summarizes the steps of the screening system developed for glaucoma detection.

Fig. 2
figure 2

Flow diagram of proposed system

Optic disc localization

Optic disc (OD) localization is the first step of a computer aided diagnostic system for glaucoma detection. A number of methods for OD localization and detection have been presented [13]–[20], but their main limitation is OD localization in the presence of other bright lesions and noise. In the proposed system, we employ our previously presented method for robust OD localization [21], which localizes the OD correctly even in the presence of large bright lesions and noise. The method is based on two assumptions: the OD normally has bright intensities, and blood vessels originate from the OD, resulting in the highest vessel density within it. The vascular pattern is first enhanced using a 2-D Gabor wavelet for better visibility, followed by thresholding based vessel segmentation. To increase the accuracy of vessel segmentation, the proposed system uses a multilayered thresholding approach that ensures the extraction of small vessels along with the large ones [22]. The next step is to extract candidate bright regions using either the red channel of the color image or a weighted gray scale version of it; the decision is based on the saturation level of the red channel, which is used only if its mean intensity value is less than an empirical value of 220. Due to the circular shape of the OD, an inverted Laplacian of Gaussian (LoG) filter is used to enhance the high intensity regions in the fundus image [21]. The size of the LoG filter is set equal to the size of the image, with σ = 53 determined empirically over all datasets. The bright candidate regions are extracted by selecting pixels carrying the top 60 % of the response of the LoG filtered image. A window of 100 × 50 is placed at the center of every candidate region and the vessel density is computed by counting the ON pixels within the window in the segmented vascular map. The OD is finally localized by selecting the region with the highest vessel density [21]. Figure 3 shows the step-wise output of the OD localization method.

Fig. 3
figure 3

From left to right and top to bottom: original fundus image; Gabor wavelet based enhanced image; vascular mask using multilayered thresholding [22]; red channel of original image; spatial domain LoG filter with σ = 53; filtered image after applying the LoG filter on the red channel; candidate regions after thresholding; region with the highest blood vessel density marked as OD; region of interest extracted around the localized OD
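The localization step can be summarized compactly. The following is a minimal sketch, not the authors' implementation, assuming a grayscale intensity image (red channel or weighted gray scale) and a binary vessel mask from the multilayered thresholding segmentation of [22]; the function names are hypothetical, while σ = 53, the 100 × 50 window and the top 60 % threshold follow the text.

```python
import numpy as np
from scipy import ndimage
from scipy.signal import fftconvolve

def log_kernel(shape, sigma=53.0):
    """Inverted Laplacian-of-Gaussian kernel with the same size as the image."""
    h, w = shape
    y, x = np.mgrid[-(h // 2):h - h // 2, -(w // 2):w - w // 2]
    r2 = x ** 2 + y ** 2
    log = ((r2 - 2 * sigma ** 2) / sigma ** 4) * np.exp(-r2 / (2 * sigma ** 2))
    return -log  # inverted: bright circular regions produce a high response

def localize_od(intensity, vessel_mask, win_h=50, win_w=100):
    """Return (row, col) of the candidate region with the highest vessel density."""
    response = fftconvolve(intensity.astype(float),
                           log_kernel(intensity.shape), mode="same")
    # keep pixels carrying the top 60 % of the LoG response
    candidates = response >= np.percentile(response, 40)
    labels, n = ndimage.label(candidates)
    best, best_density = None, -1
    for cy, cx in ndimage.center_of_mass(candidates, labels, range(1, n + 1)):
        r0, c0 = int(cy) - win_h // 2, int(cx) - win_w // 2
        window = vessel_mask[max(r0, 0):r0 + win_h, max(c0, 0):c0 + win_w]
        if window.sum() > best_density:
            best, best_density = (int(cy), int(cx)), window.sum()
    return best
```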

Feature extraction

Once the OD is localized, the region of interest (ROI) containing the OD and its neighboring pixels is extracted. The average diameter (D) of the OD over the whole database is measured and a region of 2D × 2D is extracted from the original input image. From the ROI, a number of features are extracted to generate a comprehensive feature space representation for accurate classification. The proposed system represents each ROI with a hybrid feature set consisting of the following features.

Cup to disc ratio CDR \((f_1)\)

CDR, the ratio of the diameter of the optic cup to that of the disc, is the most commonly used indicator of glaucoma. A CDR value of 0.5 or less is considered normal, whereas a value above 0.5 indicates a glaucoma suspect and any value above 0.64 shows a high risk of glaucoma [23]. In the proposed system, the vertical cup to disc ratio is computed as shown in Fig. 4.

Fig. 4
figure 4

Vertical cup to disc ratio

The disc is extracted from the colored ROI by applying morphological closing [24] to suppress the blood vessels. This gives a smooth fundus region \(\phi_f\) containing bright regions only. The result of the morphological closing is converted to gray scale and adaptive thresholding is applied to produce a binary image. Canny edge detection is then applied on the binary image to obtain the boundary of the optic disc.

For extraction of the cup, the green plane of the colored ROI is used. As the cup is much brighter than the disc, it is extracted by applying a higher threshold than the one used for the disc; the cup threshold is chosen so that it selects pixels having the top 20 % intensity values in the ROI. The Canny edge filter is then applied to obtain the boundary of the cup. The areas of both cup and disc are measured and used in the calculation of CDR.
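As an illustration, the disc/cup extraction and the vertical CDR can be sketched as below. This is a simplified sketch under stated assumptions, not the exact implementation: the structuring element size and adaptive threshold parameters are assumptions, and the vertical diameters are measured directly from the binary masks rather than from the Canny boundaries.

```python
import cv2
import numpy as np

def vertical_cdr(roi_bgr):
    # suppress vessels with morphological closing on the colour ROI
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (15, 15))
    closed = cv2.morphologyEx(roi_bgr, cv2.MORPH_CLOSE, kernel)

    # disc: grayscale + adaptive thresholding of the smoothed fundus region
    gray = cv2.cvtColor(closed, cv2.COLOR_BGR2GRAY)
    disc = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                 cv2.THRESH_BINARY, 51, -5)

    # cup: top 20 % intensity values in the green plane
    green = closed[:, :, 1]
    cup = (green >= np.percentile(green, 80)).astype(np.uint8)

    def v_diameter(mask):
        rows = np.where(mask.any(axis=1))[0]
        return rows[-1] - rows[0] + 1 if rows.size else 0

    return v_diameter(cup) / max(v_diameter(disc), 1)
```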

Rim to disc ratio RDR \((f_2)\)

The neuroretinal rim between the disc and cup boundaries is of significant importance: increased intra-ocular pressure (IOP) results in thinning of the rim between the optic cup and optic disc. In the proposed system, the vertical RDR is added to the feature set as it is an indicator of glaucoma [23]. RDR is computed as the ratio of the NRR thickness to the disc diameter. Figure 5 shows the NRR thickness against the disc diameter.

Fig. 5
figure 5

Vertical rim to disc ratio

Unlike CDR, a decrease in RDR indicates an affected eye. The maximum value of CDR can reach 1 if the cup grows to exactly the size of the disc, whereas RDR cannot exceed 0.5 [23]. A healthy eye can have an RDR of up to 0.45, an RDR of 0.1–0.19 is considered a glaucoma suspect, and any RDR below 0.1 is an indicator of glaucoma [3], [25]. In the proposed system, vertical rim to disc diameter ratios are employed to classify images as normal or glaucomatous.
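A small worked example ties the two ratios together; the diameters below are hypothetical values measured from segmented boundaries.

```python
disc_d, cup_d = 200, 130              # vertical diameters in pixels (assumed)
cdr = cup_d / disc_d                  # 0.65 -> high risk of glaucoma [23]
rim = (disc_d - cup_d) / 2            # NRR thickness = 35 px
rdr = rim / disc_d                    # 0.175 -> glaucoma suspect range [3, 25]
```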

Spatial features \((f_3 - f_7)\)

The use of spatial features is motivated by the fact that the area of the cup changes from a normal to a glaucomatous eye and the intensity of the cup is normally higher than that of the disc. The proposed spatial features are based on the intensity values of the red channel of the ROI. Before feature extraction, contrast enhancement is applied to the red channel of the ROI using a w × w sliding window to improve the contrast between optic disc and optic cup. It is assumed that w is large enough to contain a statistically representative distribution of the cup, as given in Eq. 1

$$\begin{aligned} g = 255\frac{[\Phi _\omega (\phi _f) - \Phi _\omega (\phi _{fmin})]}{[\Phi _\omega (\phi _{fmax}) - \Phi _\omega (\phi _{fmin})]} \end{aligned}$$
(1)

where \(\Phi _\omega\) is the sigmoidal function for a window defined as

$$\begin{aligned} {\Phi _\omega (\phi _f)} = \left[1+exp\left(\frac{m_\omega -\phi _f}{\sigma _\omega }\right)\right]^{-1} \end{aligned}$$
(2)

\(\phi _{fmax}\) and \(\phi _{fmin}\) are the maximum and minimum intensity values of the red channel image respectively, while \(m_\omega\) and \(\sigma _\omega\) are the mean and standard deviation of the intensity values within the window.
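This enhancement can be sketched compactly, computing the local mean and standard deviation with a uniform filter; the window size w used below is an assumption, constrained only by the requirement that it covers a representative portion of the cup.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def enhance_red(red, w=101):
    f = red.astype(float)
    m = uniform_filter(f, w)                                  # local mean m_w
    var = np.maximum(uniform_filter(f ** 2, w) - m ** 2, 1e-12)
    s = np.sqrt(var)                                          # local sigma_w

    def phi(v):                                               # Eq. 2
        return 1.0 / (1.0 + np.exp((m - v) / s))

    lo, hi = phi(f.min()), phi(f.max())                       # phi_fmin, phi_fmax
    return 255.0 * (phi(f) - lo) / (hi - lo + 1e-12)          # Eq. 1
```

The spatial features are: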

Mean intensity \((f_3)\) The mean intensity value of the enhanced red channel pixels. A glaucomatous image contains a larger cup than a normal image, so the number of pixels with high intensity values is greater in the abnormal case.

Standard deviation \((f_4)\) The standard deviation of the enhanced red channel pixels, which shows the spread of intensity values.

Energy \((f_5)\) The sum of squares of all pixel intensities. Glaucomatous images have higher energy because they contain brighter regions than normal images.

Mean and standard deviation of gradient magnitude \((f_6, f_7)\) The gradient magnitude of the enhanced red channel is computed using the Sobel operator [24], and its mean and standard deviation are taken as features.
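Given the enhanced red channel, the five spatial features reduce to a few array operations; a minimal sketch:

```python
import numpy as np
from scipy import ndimage

def spatial_features(enhanced):
    gx = ndimage.sobel(enhanced, axis=1)      # horizontal Sobel gradient
    gy = ndimage.sobel(enhanced, axis=0)      # vertical Sobel gradient
    grad = np.hypot(gx, gy)                   # gradient magnitude
    return {
        "f3_mean": enhanced.mean(),
        "f4_std": enhanced.std(),
        "f5_energy": float(np.sum(enhanced.astype(float) ** 2)),
        "f6_grad_mean": grad.mean(),
        "f7_grad_std": grad.std(),
    }
```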

Spectral features \((f_8, f_9, f_{10})\)

The spectral features are based on higher order spectra (HOS), which carry more reliable information about the image [8]. The bispectrum \(X_{BS}\) of a signal x(nT) for frequencies \(\omega _1\) and \(\omega _2\) is given by Eq. 3

$$\begin{aligned} X_{BS}(\omega _1,\omega _2) = E\big [X(\omega _1)X(\omega _2)X^*(\omega _1+\omega _2)\big ] \end{aligned}$$
(3)

where \(X(\omega )\) is the Fourier transform, \(^*\) denotes the complex conjugate and \(E[\cdot ]\) is the expectation operator. The spectral features consist of the mean magnitude of the bispectrum \((Mag_{BS})\), the normalized bispectral entropy \((NE_{BS})\) and the normalized bispectral squared entropy \((NE_{BS^2})\). Their mathematical expressions are given in Eqs. 4–6 [8].

$$\begin{aligned} Mag_{BS} = \frac{1}{N}\sum _\Omega |X_{BS}(\omega _1,\omega _2)| \end{aligned}$$
(4)
$$\begin{aligned} NE_{BS} = -\sum _n p_n log p_n \end{aligned}$$
(5)
$$\begin{aligned} NE_{BS^2} = -\sum _n q_n log q_n \end{aligned}$$
(6)

where \(p_n = |X_{BS}(\omega _1,\omega _2)|/\sum _\Omega |X_{BS}(\omega _1,\omega _2)|\) and \(q_n = |X_{BS}(\omega _1,\omega _2)|^2/\sum _\Omega |X_{BS}(\omega _1,\omega _2)|^2\).
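A sketch of these HOS features for a 1-D signal (for instance, a radial intensity profile of the ROI) is given below; the expectation in Eq. 3 is approximated by averaging over fixed-length segments, and the segment length N is an assumption.

```python
import numpy as np

def bispectral_features(x, N=64):
    segments = [x[i:i + N] for i in range(0, len(x) - N + 1, N)]
    B = np.zeros((N, N), dtype=complex)
    w = np.arange(N)
    for s in segments:
        X = np.fft.fft(s)
        # X(w1) X(w2) X*(w1 + w2), Eq. 3, frequencies taken modulo N
        B += X[w, None] * X[None, w] * np.conj(X[(w[:, None] + w[None, :]) % N])
    mag = np.abs(B / len(segments))

    p = mag / mag.sum()                       # normalized magnitudes
    q = mag ** 2 / (mag ** 2).sum()
    eps = 1e-12
    mag_bs = mag.mean()                       # Mag_BS, Eq. 4
    ne_bs = -np.sum(p * np.log(p + eps))      # NE_BS,  Eq. 5
    ne_bs2 = -np.sum(q * np.log(q + eps))     # NE_BS2, Eq. 6
    return mag_bs, ne_bs, ne_bs2
```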

The novelty of the feature extraction phase lies in the generation of a feature vector of hybrid features representing different aspects of the OD to discriminate a glaucomatous OD from a normal one. Traditional features such as CDR and RDR contribute most towards high accuracy, but combining them with the other descriptors results in even better accuracies. We performed Wilcoxon and Ansari-Bradley rank tests to give more insight into the effectiveness of the individual features. Table 1 shows the performance evaluation of the features using these rank tests, arranged in descending order of their absolute scores. Although the last few features could be excluded due to low scores in one test, they give acceptable scores in the other test, so we have not performed feature selection here.

Table 1 Performance of all features calculated using Wilcoxon and Ansari-Bradley tests for glaucoma detection

Multivariate m-mediods based modeling and classification of glaucoma

The feature extraction phase extracts the features described above from the ROI of each fundus image and represents it in the form of a feature vector F

$$\begin{aligned} F = \{f_1,f_2,f_3,f_4,f_5,f_6,f_7,f_8,f_9,f_{10}\} \end{aligned}$$

Given the feature vector representation of the optic nerve head, we now present our proposed approach for classifying a retinal image as normal or suffering from glaucoma. The approach first applies a supervised transformation of the feature space to increase inter-class distance whilst decreasing intra-class distance. We employ local Fisher discriminant analysis (LFDA) for this supervised enhancement; LFDA identifies principal components in the original feature space that maximize discrimination between the classes. More formally, let \(DB=\{F_{1},F_{2},..,{F_{n}}\}\) be a set of n training samples belonging to the two classes of retinal images {normal retinal image, glaucoma}. The between class and within class scatter matrices are computed as:

$$\aleph _{b} = \frac{1}{2}\sum \limits _{i=1}^{n}\sum \limits _{j=1}^{n} \Im _{i,j}^{b}\,\Vert F_{i},F_{j}\Vert$$
(7)
$$\aleph _{w} = \frac{1}{2}\sum \limits _{i=1}^{n}\sum \limits _{j=1}^{n} \Im _{i,j}^{w}\,\Vert F_{i},F_{j}\Vert$$
(8)

where \(\Vert .,.\Vert\) is a Euclidean distance function and

$$\Im _{i,j}^{w} = \left\{ \begin{array}{ll} \exp \left( -\frac{\Vert F_{i},F_{j}\Vert ^{2}}{\varsigma _{i}\varsigma _{j}}\right) \times \frac{1}{n_{k}} & \quad if\; F_{i}, F_{j} \in C_{k} \\ 0 & \quad otherwise \end{array} \right.$$
(9)
$$\Im _{i,j}^{b} = \left\{ \begin{array}{ll} \exp \left( -\frac{\Vert F_{i},F_{j}\Vert ^{2}}{\varsigma _{i}\varsigma _{j}}\right) \times \left( \frac{1}{n}-\frac{1}{n_{k}}\right) & \quad if\; F_{i}, F_{j} \in C_{k} \\ \frac{1}{n} & \quad otherwise \end{array} \right.$$
(10)

Here, \(n_{k}\) is the membership count of class \(\mathbf {C_{k}}\) and \(\varsigma _{i}\) is the average distance of sample \(F_{i}\) to its k nearest neighbors; we set \(k=5\) based on empirical evaluation. This local scaling of the distance between two samples in the affinity matrix is critical in handling the varying distribution of samples within a given pattern. The eigenvalue decomposition \(\aleph _{b}E=\lambda \aleph _{w} E\) is then performed, where \(\lambda\) is a generalized eigenvalue and E is the corresponding eigenvector. The LFDA-transformed enhanced feature space representation of the optic nerve head in a retinal image is then computed as:

$$\begin{aligned} F=\{E_{1},E_{2},..,E_{m}\} \end{aligned}$$
(11)

where \(\{E_{1},E_{2},..,E_{m}\}\) are the eigenvectors arranged in descending order of their corresponding eigenvalues \(\{\lambda _{1}, \lambda _{2},.., \lambda _{m}\}\).
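A condensed sketch of this transformation is given below. It builds the locally scaled affinities of Eqs. 9–10 (with the conventional negative exponent), forms the scatter matrices in graph Laplacian form and solves the generalized eigenproblem; the small ridge term added to \(\aleph_w\) for numerical stability is an implementation detail, not part of the method.

```python
import numpy as np
from scipy.linalg import eigh
from scipy.spatial.distance import cdist

def lfda_transform(F, labels, m=5, knn=5):
    n, d = F.shape
    D = cdist(F, F)                                   # pairwise distances
    sigma = np.sort(D, axis=1)[:, knn]                # local scale: k-th neighbour
    A = np.exp(-D ** 2 / np.outer(sigma, sigma))      # locally scaled affinity
    same = labels[:, None] == labels[None, :]
    n_k = np.array([np.sum(labels == l) for l in labels], dtype=float)
    nk = np.where(same, n_k[:, None], np.inf)         # size of the shared class

    Ww = np.where(same, A / nk, 0.0)                          # Eq. 9
    Wb = np.where(same, A * (1.0 / n - 1.0 / nk), 1.0 / n)    # Eq. 10

    def scatter(W):
        L = np.diag(W.sum(axis=1)) - W                # Laplacian of weight graph
        return F.T @ L @ F

    Sb, Sw = scatter(Wb), scatter(Ww)
    vals, vecs = eigh(Sb, Sw + 1e-9 * np.eye(d))      # aleph_b E = lambda aleph_w E
    order = np.argsort(vals)[::-1][:m]                # top m eigenvectors
    return F @ vecs[:, order]
```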

The enhanced feature space representation of the optic nerve head is now used to generate models of normal retinal images and of retinal images affected by glaucoma. As we expect variation in the number and distribution of samples within these two classes, we employ a multivariate m-Mediods based modeling and classification approach to handle multimodal distributions of samples within a modeled pattern [26]. Modeling of patterns using the proposed approach comprises three steps. In the first step, a quantized representation of the training samples, referred to as mediods, is generated. Our approach identifies mediods such that the number of mediods in different parts of the distribution is proportional to the density of samples there. We present an extension of the self organizing map (SOM) based learning approach to identify mediods. Let \(DB^{(i)}\) be the enhanced feature space representation of the training data belonging to class i and W the weight vector associated with each output neuron. The SOM network is initialized with a greater number of output neurons than the desired number of mediods m. The value of \(\#_{output}\) is determined empirically as:

$$\#_{output}=\left\{ \begin{array}{ll}\xi & if \quad \xi< 150 \,\wedge \, \xi > (m\times 2)\\ m\times 2 & if \quad \xi <(m\times 2)\\ 150 & if \quad \xi > 150 \end{array} \right.$$
(12)

where \(\xi =size(DB^{(i)})/2\). The weight vectors of the output neurons \(W_i\) (where \(1\le i\le \#_{output}\)) are initialized from the probability density function (PDF) \(N(\mu ,\Sigma )\) estimated from the training samples in \(DB^{(i)}\). The enhanced feature space representations of the training samples are sequentially input to train the network. The k nearest weights (k-NW) to the current training sample F are identified using:

$$\begin{array}{c} k-NW (F,\mathbf {W},k)=\{C \in \mathbf {W} | \forall R \in C, S \in \mathbf{W}-C,\\ \qquad \qquad \qquad \qquad \qquad \,\, \Vert F,R\Vert \le \Vert F,S\Vert \wedge |C|=k \} \end{array}$$
(13)

where \(\mathbf {W}\) is the set of all weight vectors, C is the set of k closest weight vectors, \(\Vert .,.\Vert\) is the Euclidean distance function and \(k=\delta (t)\), where \(\delta (t)\) is a neighborhood size function whose value decreases gradually over time as specified in Eq. (16). The network is trained by updating the subset of weights C using

$$\begin{aligned} W_c(t+1)=W_c(t)+\alpha (t)\zeta (j,k)(F-W_c(t))\quad \forall \, W_c\in C \end{aligned}$$
(14)

where \(W_c\) is the weight vector of output neuron c, j is the order of closeness of \(W_c\) to F \((1\le j\le k)\), \(\zeta (j,k)=exp(-(j-1)^2/2k^2)\) is a membership function that has value 1 when \(j=1\) and falls off as j increases, \(\alpha (t)\) is the learning rate of the SOM and t is the training cycle index. The learning rate \(\alpha (t)\) and neighborhood size \(\delta (t)\) are decreased exponentially over time using:

$$\begin{aligned} \alpha (t)=1-e^{\frac{2(t-t_{max})}{t_{max}}} \end{aligned}$$
(15)
$$\begin{aligned} \delta (t)=\lceil \delta _{init}(1-e^{\frac{2(t-t_{max})}{t_{max}}})\rceil \end{aligned}$$
(16)

where \(t_{max}\) is the maximum number of training iterations and \(\delta _{init}\) is the initial neighborhood size, empirically set to 5. This learning process is repeated for a fixed number of training iterations. Training samples are then assigned to their closest output neurons, and output neurons with no members are filtered out as they do not represent any part of the normality distribution of the given class. As the network was initialized with more output neurons than the desired number of groupings, we iteratively merge the most similar output neurons (ij) (indexed by (ab)) as:

$$\begin{aligned} W_{ab}=\frac{|W_a|\times W_a+|W_b|\times W_b}{|W_a|+|W_b|} \end{aligned}$$
(17)

where |.| is the membership count function and the indices of the most similar output neurons are obtained as:

$$\begin{aligned} (a,b)=arg\,min_{(i,j)}\,\,[\Vert W_i,W_j\Vert \times (|W_i|+|W_j|)]^\frac{1}{2}\,\,\,\forall \,i,j\,\wedge \, i\ne j \end{aligned}$$
(18)

The output neurons are merged iteratively until the number of weight vectors equals m. The weight vectors \(\mathbf {W}\) are appended to the list of mediods \(\mathbf {M}^{(i)}\) modeling pattern i. The sets of mediods generated for the different classes are then used to determine the customized normality ranges, as specified in the following.
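A compressed sketch of this mediod learning step is shown below: SOM-style updates with the decaying \(\alpha(t)\) and \(\delta(t)\) of Eqs. 15–16, filtering of empty neurons, and the weighted merging of Eqs. 17–18 down to m mediods. The initialization from \(N(\mu,\Sigma)\) is simplified and the loop structure is illustrative rather than the authors' exact implementation.

```python
import numpy as np

def learn_mediods(X, m, t_max=100, delta_init=5):
    n_out = min(max(len(X) // 2, 2 * m), 150)           # Eq. 12
    W = np.random.multivariate_normal(X.mean(0), np.cov(X.T), n_out)
    for t in range(t_max):
        alpha = 1.0 - np.exp(2.0 * (t - t_max) / t_max) # Eq. 15
        k = max(int(np.ceil(delta_init * alpha)), 1)    # Eq. 16
        for F in X:
            near = np.argsort(np.linalg.norm(W - F, axis=1))[:k]   # k-NW, Eq. 13
            for j, c in enumerate(near):
                zeta = np.exp(-j ** 2 / (2.0 * k ** 2)) # membership, 1 for closest
                W[c] += alpha * zeta * (F - W[c])       # Eq. 14
    # drop output neurons with no member samples
    members = np.argmin(np.linalg.norm(X[:, None] - W[None], axis=2), axis=1)
    counts = np.bincount(members, minlength=len(W))
    W, counts = W[counts > 0], counts[counts > 0]
    while len(W) > m:                                   # merge, Eqs. 17-18
        D = np.linalg.norm(W[:, None] - W[None], axis=2)
        np.fill_diagonal(D, np.inf)
        score = np.sqrt(D * (counts[:, None] + counts[None, :]))
        a, b = np.unravel_index(np.argmin(score), score.shape)
        W[a] = (counts[a] * W[a] + counts[b] * W[b]) / (counts[a] + counts[b])
        counts[a] += counts[b]
        W, counts = np.delete(W, b, axis=0), np.delete(counts, b)
    return W
```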

After the computation of the m mediods representing a given pattern c, we determine the set of possible normality ranges separately for each mediod in the model \(\mathbf {M}^{(c)}\) of class c. Customized selection of the set of possible normality ranges for each class enables each mediod to have its own normality range, dependent on the local distribution of samples around it, and consequently enables the proposed modeling approach to cater for multivariate distributions of samples within a given class. More formally, let \(\mathbf {D}^{(c)}\) be the set of possible normality ranges for pattern c. The possible normality ranges for pattern c are determined by computing the k nearest mediods of each mediod \(M_p \in \mathbf {M}^{(c)}\) as:

$$\begin{array}{c} k-NW (M_p,\mathbf {M}^{(c)},k)=\{\mathbf {C} \in \mathbf {M}^{(c)} | \forall R \in \mathbf {C}, S \in \mathbf {M}^{(c)}-\mathbf {C},\\ \qquad \qquad \qquad \qquad \qquad \qquad \Vert M_p,R\Vert \le \Vert M_p,S\Vert \wedge |C|=k \} \end{array}$$
(19)

where \(\mathbf {C}\) is the set of k closest mediods w.r.t. \(M_p\). The set of possible normality ranges \(\mathbf {D}^{(c)}\) is then updated as:

$$\begin{aligned} \mathbf {D}^{(c)} = \{\mathbf {D}^{(c)} \cup \Vert M_p,R\Vert \} \forall R \in \mathbf {C} \end{aligned}$$
(20)

The set of normality ranges is updated for all \(M_p \in \mathbf {M}^{(c)}\). The whole process is repeated to compute the set of mediods and their corresponding sets of possible normality ranges for all classes.

Once the mediods and sets of possible normality ranges have been identified for all patterns, we select a customized normality range \(\wp\) for each mediod depending on the distribution of samples from the same and different patterns around it. Instead of using all the training data to learn the customized normality range for each mediod, we employ only those training samples that lie in the neighborhood of the given mediod. The mediod memberships of the training data are determined by sequentially inputting labeled training instances belonging to all classes and identifying the closest mediod, indexed by p, using:

$$\begin{aligned} p=arg\,min_{j}\,\Vert F,M_{j}\Vert \quad \forall \, j \end{aligned}$$
(21)

where F is the training sample. Let \(\mathbf {\Gamma }^{(c)}_j\) represent the subset of training samples identified as members of mediod \(M_j\) belonging to class c. We maintain false positive (\(FP_j\)) and false negative (\(FN_j\)) statistics for each mediod \(M_j\), both initialized to 0. We sequentially input the members of mediod \(M_j\) and update the \(FP_{j}^{k}\) and \(FN_{j}^{k}\) statistics corresponding to the possible normality ranges \(D_{k} \in \mathbf {D}^{(c)}\) as:

$$\begin{aligned} FP_{j}^{k} = FP_{j}^{k}+1\,\,\,\,\,\, iff \,\,\,\Vert M_{j},F\Vert <D_{k}\,\,\, \wedge \,\,\,L(M_{j})\ne L(F) \,\,\,\,\,\forall F \in \mathbf {\Gamma }_{j}, D_{k} \in \mathbf {D}^{(c)} \end{aligned}$$
(22)
$$\begin{aligned} FN_{j}^{k} = FN_{j}^{k}+1\,\,\,\,\,\, iff \,\,\,\Vert M_{j},F\Vert >D_{k}\,\,\, \wedge \,\,\,L(M_{j}) = L(F) \,\,\,\,\,\forall F \in \mathbf {\Gamma }_{j}, D_{k} \in \mathbf {D}^{(c)} \end{aligned}$$
(23)

where L(.) is a function that returns the label of a given mediod or sample. The customized range validity index \((\chi )\), which checks the effectiveness of the different possible normality ranges for a particular mediod, is then computed as:

$$\begin{aligned} \chi _{j}^{k}=\beta \times FP_{j}^{k} + (1-\beta ) \times FN_{j}^{k}\,\,\,\,\,\,\,\,\,\, 0\le \beta \le 1\,\,\,\,\,\,\forall k \end{aligned}$$
(24)

where \(\beta\) is a scaling parameter that adjusts the sensitivity of the proposed classifier to false positives and false negatives according to specific requirements. The index of the normality range from \(\mathbf {D}^{(c)}\) giving the optimal value of \(\chi\) for a given mediod \(M_j\) is identified as:

$$\begin{aligned} \imath =arg\,\,min_k\,\,\,\,\,\, \chi _{j}^{k} \end{aligned}$$
(25)

The customized normality range \(\wp _j\) corresponding to mediod \(M_j\) is then identified as:

$$\begin{aligned} \wp _{j} = D_{\imath } \end{aligned}$$
(26)
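The range selection of Eqs. 22–26 amounts to a small search per mediod; a minimal sketch, assuming the member samples \(\Gamma_j\) and candidate ranges \(\mathbf{D}^{(c)}\) have already been gathered as NumPy arrays:

```python
import numpy as np

def select_normality_range(mediod, mediod_label, members, member_labels,
                           ranges, beta=0.5):
    d = np.linalg.norm(members - mediod, axis=1)
    same = member_labels == mediod_label
    best_range, best_chi = None, np.inf
    for Dk in ranges:
        fp = np.sum((d < Dk) & ~same)        # Eq. 22: intruders accepted
        fn = np.sum((d > Dk) & same)         # Eq. 23: members rejected
        chi = beta * fp + (1 - beta) * fn    # Eq. 24
        if chi < best_chi:
            best_range, best_chi = Dk, chi   # Eqs. 25-26
    return best_range
```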

Once the mediods and their customized normality ranges are identified, the learned m-Mediods model of normality is used to classify unseen retinal images as normal or affected by glaucoma. Classification of the feature vector representation of the optic nerve head of an unseen retinal image is performed by identifying the k nearest mediods, from the entire set of mediods (\(\mathbf {M}\)), w.r.t. the query Q as:

$$\begin{array}{ll} k-NM (Q,\mathbf {M},k)=\{\mathbf {C} \in \mathbf {M} | \forall R \in \mathbf {C}, S \in \mathbf {M}-\mathbf {C},\\ \qquad \qquad \qquad \qquad \qquad \quad \,\, \Vert Q,R\Vert \le \Vert Q,S\Vert \wedge |C|=k \} \end{array}$$
(27)
Fig. 6
figure 6

Depiction of the working of proposed multivariate m-Mediods based modeling and classification approach a mediods superimposed on training samples, b computation of possible normality ranges for each class, c customized normality regions of patterns identified using proposed approach

The test sample is now tested w.r.t. all the mediods from the set of k nearest mediods, starting from the closest and moving to the farthest. Let \(\imath\) be the index of the nearest mediod in the k-NM result (initialized to 1), and let r and c be the index of the \(\imath\)th nearest mediod and its corresponding class respectively. Test sample Q is classified to class c if:

$$\begin{aligned} Dist(Q,M_r)\le \wp _{r} \end{aligned}$$
(28)

If the condition specified in Eq. (28) is not satisfied, we increment the index \(\imath\) by 1 to test sample Q w.r.t. the next nearest mediod. This process is repeated until \(\imath =k\). If the test sample Q has not been identified as a valid member of any class, it is assigned to the class with the highest number of members in the k-NM result obtained in Eq. (27).
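The classification procedure of Eqs. 27–28 then reduces to a short loop over the pooled mediods; a sketch, assuming the mediods, their labels and customized ranges are stored as NumPy arrays:

```python
import numpy as np

def classify(Q, mediods, med_labels, med_ranges, k=5):
    d = np.linalg.norm(mediods - Q, axis=1)
    nearest = np.argsort(d)[:k]              # k-NM, Eq. 27
    for r in nearest:                        # closest to farthest
        if d[r] <= med_ranges[r]:            # Eq. 28: within normality range
            return med_labels[r]
    # fall back to the majority class among the k nearest mediods
    vals, counts = np.unique(med_labels[nearest], return_counts=True)
    return vals[np.argmax(counts)]
```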

A visualization of the proposed multivariate m-Mediods based modeling and classification is presented in Fig. 6. Each point in Fig. 6a depicts a feature vector representation; the same color and marker are used for samples belonging to the same class. The mediods extracted to model the glaucoma classes are represented by squares superimposed on each group of instances. Figure 6b presents the set of possible normality ranges for each modeled pattern. The customized normality ranges identified for the various mediods, and the resulting normality regions of the classes, are depicted in Fig. 6c. A test sample is classified to a class if it lies within the normality region represented by one of its constituent mediods.

It is important to note that the proposed classifier does not give a hard classification decision by simply assigning the feature vector to the class of the majority of the nearest mediods. A test sample may be closer to one class yet fall outside its normality range because of a dense distribution of samples around the corresponding mediod, while still falling within the normality threshold of some other mediod with a sparser distribution. Our classifier handles this situation by checking the membership of the test sample w.r.t. mediods belonging to different classes and assigning the sample to the class for which it falls within the normality range. This soft classification enables the proposed approach to handle overlapping, complex-shaped class distributions with variable densities, as expected in the problem at hand.

Experimental results

Material

Quantitative analysis of the proposed system is performed using several publicly available databases and one local database for the OD localization and glaucoma detection methods. The DRIVE database contains 40 retinal fundus images of size 768 \(\times\) 584 [27], captured with a Canon CR5 non-mydriatic retinal camera with a 45 degree field of view (FOV). The STARE database contains 400 retinal images of size 700 \(\times\) 605, acquired with a TopCon TRV-50 retinal camera with a 35 degree FOV [28]. DIARETDB (DIAbetic RETinopathy DataBase) is designed to evaluate automated lesion detection algorithms [29]; it contains 89 retinal images with different retinal abnormalities, captured with a 50 degree FOV at a resolution of 1500 \(\times\) 1152. The Digital Retinal Images for Optic Nerve Segmentation database (DRIONS-DB) contains 110 images from the Ophthalmology Service at Miguel Servet Hospital, Saragossa, Spain [30], with ground truths for optic disc segmentation. The Hamilton Eye Institute Macular Edema Dataset (HEI-MED) contains 169 fundus images [31] and is primarily designed for detection of exudates and macular edema. MESSIDOR is one of the largest retinal image databases, with 1200 images covering different diseases, varying dimensions, patients of different ages and ethnicities, and different stages of disease [32]. HRF (High Resolution Fundus) is another fundus image database containing 45 images in total [33], with annotations for vessels, the optic disc and glaucoma.

A local database of 462 images has been gathered from a local hospital. The images were captured using a TopCon TRC 50EX camera at a resolution of 1504 \(\times\) 1000. A subset of 120 images has been annotated by two ophthalmologists for glaucoma and is named the glaucoma database (GlaucomaDB) [12]. A MATLAB based annotation tool was used by the ophthalmologists for calculation of CDR and labeling of images as glaucoma or non glaucoma. The sole purpose of this database is to facilitate researchers in automated glaucoma detection; the database, along with CDR values and glaucoma labels, is available online [12].

HRF is the only database which already includes annotations for glaucoma; all other databases except STARE have been annotated with the help of two ophthalmologists. In the case of MESSIDOR and HEI-MED, only 100 and 50 images are annotated for glaucoma out of 1200 and 169 respectively. The STARE database is used only to evaluate the accuracy of OD localization, not glaucoma detection. Table 2 shows the image level specifications of all databases.

Table 2 Image level description of each database for glaucoma

Results

The analysis of the proposed system covers both OD localization and glaucoma detection. The performance of the proposed OD localization method is compared with existing techniques and the accuracies are reported in Table 3. Pictorial results for OD localization are given in Fig. 7.

Table 3 Performance comparison of OD localization with other existing techniques
Fig. 7
figure 7

Row 1 candidate bright regions, Row 2 extracted blood vessels, Row 3 bounding box centered at each bright region centroid, marked on the segmented vessels to show the vessel density calculation, Row 4 localized OD marked with a cross

Detailed evaluation and analysis of the proposed glaucoma detection method is done by computing sensitivity, specificity, positive predictive value (PPV) and accuracy, as given in Eqs. 29–32 respectively.

$$\begin{aligned} Sensitivity = \frac{T_P}{T_P + F_N} \end{aligned}$$
(29)
$$\begin{aligned} Specificity = \frac{T_N}{T_N + F_P} \end{aligned}$$
(30)
$$\begin{aligned} PPV = \frac{T_P}{T_P + F_P} \end{aligned}$$
(31)
$$\begin{aligned} Accuracy = \frac{T_P + T_N}{T_P + T_N + F_P + F_N} \end{aligned}$$
(32)
  • \(T_P\) are true positives, meaning glaucomatous images are correctly classified.

  • \(T_N\) are true negatives, meaning non glaucoma images are correctly classified.

  • \(F_P\) are false positives, meaning non glaucoma images are wrongly classified as glaucomatous.

  • \(F_N\) are false negatives, meaning glaucomatous images are wrongly classified as non glaucoma.
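These measures follow directly from the confusion matrix counts defined above; the counts in the usage line below are placeholders.

```python
def metrics(tp, tn, fp, fn):
    return {
        "sensitivity": tp / (tp + fn),                    # Eq. 29
        "specificity": tn / (tn + fp),                    # Eq. 30
        "ppv":         tp / (tp + fp),                    # Eq. 31
        "accuracy":    (tp + tn) / (tp + tn + fp + fn),   # Eq. 32
    }

print(metrics(tp=34, tn=42, fp=3, fn=2))                  # placeholder counts
```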

To train and test the system, 70 % of the data is used for training and 30 % for testing, with images of each type drawn from the different databases so that the robustness and versatility of the proposed system can be checked. All databases are first normalized to the same size and intensity ranges. The experiments are repeated 10 times to avoid bias. Table 4 shows the averaged performance of the proposed system for each database.

Table 4 Evaluation results of the proposed system for glaucoma detection

A quantitative comparison of the proposed mediods based classifier is performed against other supervised classifiers, including k nearest neighbors (KNN), a multilayered perceptron (MLP) with five hidden layers, a Gaussian mixture model (GMM) with 9 mixtures and a support vector machine (SVM) with a radial basis function kernel. The experiment is conducted on the newly proposed glaucoma database and the results are presented in Table 5.

Table 5 Comparison of proposed classifier with existing classifiers

Figure 8 shows receiver operating characteristic (ROC) curves for the four databases containing a substantial number of glaucoma images. These curves have been generated for the proposed system with the mediods based classifier.

Fig. 8
figure 8

ROC curves for glaucoma detection on different databases using the proposed system

Discussion

The proposed method has been tested on seven databases with different image sizes and intensity variations. These variations are catered for by performing normalization before all other processing; images from all databases are normalized to a fixed resolution of 1000 \(\times\) 1600. To calculate the accuracy of the OD localization method, a MATLAB based annotation tool was designed and OD centers were marked for all images with the help of an ophthalmologist. These OD centers are considered ground truths, and the distance of each automatically detected OD center from its ground truth is calculated. The OD is considered correctly detected if this distance is less than 10 pixels. Table 3 shows the comparison of the proposed system with existing OD localization techniques.

It is important to highlight that all the databases used, except DRIVE and DRIONS, contain images with bright lesions and acquisition noise. The results support the proposed OD localization method. Figure 7 contains images with a large number of lesions, especially bright lesions, which complicate the localization of the OD; the figure also contains an image with distorted vessels. In all these cases, simple intensity or vessel tracking based methods will not give the desired results, whereas the proposed method correctly localizes the OD in the presence of such unwanted artifacts.

Glaucoma detection results are also presented in Table 5. As is evident from the table, the proposed multivariate m-Mediods based approach outperforms its competitors. Its superior performance is due to its ability to handle complex-shaped classes with multivariate distributions of samples, as highlighted in Fig. 6. Classifiers such as GMM, on the other hand, can effectively handle only those patterns whose distribution is predominantly Gaussian [34]; GMM is not effective for complex-shaped classes with tight and complex decision surfaces. SVM can handle patterns with complex decision boundaries, but it tends to tilt towards classes with larger numbers of training samples in its attempt to maximize overall accuracy. This is not desirable in medical image analysis, where only limited training data is available for sensitive or extreme cases [34]. It is further noted that incorporating supervised feature enhancement using LFDA improves class separation, helping the different classifiers to improve on their original classification capabilities, as is clear from the results given for LFDA-GMM and LFDA-SVM in Table 5.

The proposed system addresses four main limitations currently present in automated glaucoma detection: (i) a robust algorithm for localization of the OD in the presence of other pathologies, so that the region of interest can be extracted accurately for glaucoma detection; (ii) a detailed feature set for true representation of glaucoma instead of just the cup to disc ratio or rim to disc ratio; (iii) an accurate classifier capable of handling anomalies and multivariate distributions of samples within a class; and (iv) an annotated database solely designed to facilitate research on glaucoma detection. These are the main contributions of this article.

Conclusion

In this article, we have presented a new method for accurate detection of glaucoma from colored retinal images. The proposed system extracts the optic disc using a novel method which analyzes vessel based features for accurate detection. Once the optic disc is detected, a region of interest is extracted for evaluation of its properties, and a number of features are computed, consisting of the cup to disc ratio, the rim to disc ratio, and spatial and spectral features. A multivariate mediods based classifier is used for detection of glaucoma.

Testing of the proposed system has been conducted using publicly available databases and one locally gathered dataset annotated with the help of the Armed Forces Institute of Ophthalmology (AFIO). A total of 554 images from different sources are used for thorough testing and evaluation. The results demonstrate the validity of the proposed system and show that the proposed OD detection method and classifier outperform existing state of the art methods and classifiers.