1 Introduction

Diabetic Retinopathy (DR), an eye disease caused by diabetes, has been a major cause of blindness among adults in the United States [3]. An effective way to evaluate the risk of DR and provide treatment at an early stage is a screening program; however, a prerequisite for automatic screening is to accurately locate the main anatomical structures, in particular the optic disc (OD). The optic disc is an orange-pink, roughly circular structure with a bright center, and its location can serve as an indicator for the detection of other fundus structures, such as the macula, since the distance between the optic disk and the macula is more or less constant [5, 21].

Many studies have been conducted on the detection of the OD in color fundus images. Early methods locate the optic disc based on its distinctive image characteristics. Sekhar et al. [16] regard the brightest regions as candidate OD positions: morphological operations are first applied to separate the brightest areas as candidate regions, and the Hough transform is then used to detect the optic disk among these candidates. In the method proposed by Mendon et al. [11], features extracted from the vascular network of RGB retinal images are used, and the optic disk is located according to the entropy along vascular directions. Lu et al. [8] assume that the optic disk is a bright circular region and propose a method in which a line operator is used to capture this circular brightness. However, these methods share a common limitation: pathological structures such as exudates and imaging artifacts such as haze often produce an appearance similar to that of the OD, leading to false detections.

Other techniques employ features of anatomical structures that are closely related to the OD, such as the macula and the retinal vessels. Akita et al. [1] locate the optic disk by backtracking the vessels to their origin, based on the assumption that the optic disk is the origin of the blood vessels in retinal fundus images. Mendels et al. [10] propose a morphological filtering technique that uses active contours to localize the optic disk. In Adam et al. [7], a fuzzy convergence method is used to determine the origin of the blood vessel network. Compared with image characteristics, the features of other anatomical structures are more robust to pathological structures and imaging noise, but there is still room for improvement in accuracy.

In this paper, we propose a robust approach to accurately detect the OD region and locate the OD center in color retinal images. First, the proposed technique employs a kernelized least-squares classifier to identify the most likely optic disk region and segment the optic disk boundary. Then, a shifting vertical window that evaluates vessel-connected components and a shifting horizontal window that evaluates illumination information are used together to find the convergence point of the blood vessels, which is taken as the optic disc center. Experimental results on the DRIVE and NIVE databases show that our approach is more effective than existing methods in both boundary extraction and center localization of the OD.

The main contributions of this paper are:

  • We propose an efficient kernelized least-squares classifier with high-dimensional feature space to locate the optic disk region quickly.

  • We propose a robust algorithm, which takes advantage of both image characteristics and anatomical structures, to detect the optic disk and its center accurately using convergence tracking of blood vessels.

  • We conduct experiments on two datasets and show that the proposed algorithm outperforms state-of-the-art methods in terms of both accuracy and speed.

The remainder of this paper is organized as follows: Section 2 contains a detailed description of our proposed technique. Section 3 presents and analyzes the experimental results. Section 4 compares our method with other OD detection approaches. Finally, Section 5 concludes the paper.

2 Materials and methods

The proposed technique comprises two major steps: the detection of the optic disk region and the location of the optic disk center. A flow chart of our method is shown in Fig. 1. These steps are described in detail in the following subsections.

Fig. 1 The flow chart of our method

2.1 Optic disc region detection

Our method is based on the CSK tracker [6], which has been reported to be the fastest among feature-based trackers [19]. To start with, two standard retinal fundus images corresponding to the left and the right eye are obtained as templates, and the optic disk regions are manually labeled in these two templates. Then, a kernelized least-squares classifier, which has the merit of exploiting the circulant structure of image patches, is used to detect the OD region; owing to this property, the detection speed of our method is outstanding. The next stage is to train the classifier. All image patches in the vicinity of the optic disk regions are used as training instances. The size of a patch is identical to that of the target in the template (assuming the patch size is M × N, with 1 ≤ m ≤ M and 1 ≤ n ≤ N); in our method, the patch size is 80 × 80. A Gaussian function, which has a symmetric bell-curve shape and minimizes ringing in the Fourier domain [6], is used to label all the pixels in the patch. We use G(m, n) to denote the label of pixel \( p_{m,n} \) in the patch. The Gaussian function is defined by Eq. (1) below, where σ is the standard deviation of a normal distribution; in our method, σ is set to 0.2:

$$ G\left(m,n\right)=\frac{1}{2\pi {\sigma}^2}{e}^{-\left({m}^2+{n}^2\right)/\left(2{\sigma}^2\right)} $$
(1)
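As a minimal illustration of Eq. (1), the sketch below builds the Gaussian label map for an 80 × 80 patch. The paper does not state how the offsets (m, n) are centered or how σ = 0.2 is scaled, so this sketch assumes the offsets are measured from the patch center and normalized by the patch size; the function name and the normalization are assumptions for illustration.

```python
import numpy as np

def gaussian_labels(M=80, N=80, sigma=0.2):
    """Gaussian label map of Eq. (1), as a sketch.

    Assumptions (not stated explicitly in the paper): the offsets (m, n)
    are measured from the patch centre and normalised by the patch size,
    so that sigma = 0.2 is meaningful on a [-0.5, 0.5] grid.
    """
    m = (np.arange(M) - M / 2.0) / M          # normalised row offsets
    n = (np.arange(N) - N / 2.0) / N          # normalised column offsets
    mm, nn = np.meshgrid(m, n, indexing="ij")
    return np.exp(-(mm**2 + nn**2) / (2 * sigma**2)) / (2 * np.pi * sigma**2)

G = gaussian_labels()   # 80 x 80 label map with its peak at the patch centre
```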

In our method, since RGB images with high-dimensional color attributes are used, we improve performance by allowing classification in a high-dimensional feature space. A Hilbert space, which arises frequently in applications that extend the feature dimension and is complete enough for the techniques of calculus to be applied, is an ideal choice for our algorithm. The mapping Φ to the Hilbert space is defined by Eq. (2):

$$ \Phi (t)=\frac{1}{\pi }{\displaystyle {\int}_{-\infty}^{\infty}\frac{\tau }{t-\tau }}d\tau $$
(2)

To train the classifier, we need to find the weights w (defined in Eq. (5)) that minimize the cost c in Eq. (3), where λ > 0 is a regularization parameter:

$$ c={\displaystyle \sum_{m=1}^{M}\sum_{n=1}^{N}{\left(\kappa \left(\Phi \left({p}_{m,n}\right),w\right)-G\left(m,n\right)\right)}^2}+\lambda {\left\Vert w\right\Vert}^2 $$
(3)

κ is the inner product defined by Eq. (4) below, where \( \overline{a} \) and \( \overline{b} \) are vectors and l is their length.

$$ \kappa \left(\overline{a},\overline{b}\right)={\displaystyle \sum_{i=1}^l{\overline{a}}_i{\overline{b}}_i} $$
(4)

The weight vector w is defined by Eq. (5):

$$ w={\displaystyle \sum_{m=1,n=1}^{M,N}a\left(m,n\right)\times \Phi \left({p}_{m,n}\right)} $$
(5)

where the coefficient a(m, n) is defined by Eq. (6):

$$ a\left(m,n\right)={\mathrm{F}}^{-1}\left(\frac{\mathrm{F}\left(G\left(m,n\right)\right)}{\mathrm{F}\left(\kappa \left({p}_{m,n},p\right)\right)+\lambda}\right) $$
(6)
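The closed form in Eq. (6) can be evaluated efficiently with the fast Fourier transform, as in the CSK formulation [6]. The sketch below assumes the linear (inner-product) kernel of Eq. (4), under which the kernel values \( \kappa(p_{m,n}, p) \) over all cyclic shifts reduce to the circular autocorrelation of the patch; the function name and the zero-mean preprocessing are illustrative assumptions rather than the paper's exact implementation.

```python
import numpy as np

def train_coefficients(patch, labels, lam=1e-4):
    """Closed-form training coefficients a(m, n) of Eq. (6), as a sketch.

    patch  : 2-D float array, the labelled template patch (zero-mean assumed).
    labels : Gaussian label map G(m, n) of Eq. (1), same shape as `patch`.
    lam    : regularization parameter lambda.
    """
    P = np.fft.fft2(patch)
    K = np.conj(P) * P                    # F(kappa(p_mn, p)) for the linear kernel
    A = np.fft.fft2(labels) / (K + lam)   # Fourier transform of a(m, n), Eq. (6)
    a = np.real(np.fft.ifft2(A))          # coefficients a(m, n)
    return a, A
```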

F is the two-dimensional Fourier transform (FT) operator defined by Eq. (7), and \( \mathrm{F}^{-1} \) is its inverse, defined by Eq. (8), where u and v are the frequency-domain variables and f(x, y) is the original spatial-domain function:

$$ F\left(u,v\right)=\mathrm{F}\left[f\left(x,y\right)\right]={\displaystyle {\int}_{-\infty}^{\infty}\kern0.1em {\displaystyle {\int}_{-\infty}^{\infty }f\left(x,y\right){e}^{-2\pi i\left(ux+vy\right)}}} dxdy $$
(7)
$$ f\left(x,y\right)={\mathrm{F}}^{-1}\left[F\left(u,v\right)\right]={\displaystyle {\int}_{-\infty}^{\infty}\kern0.1em {\displaystyle {\int}_{-\infty}^{\infty }F\left(u,v\right){e}^{2\pi i\left(ux+vy\right)}}} dudv $$
(8)

Since the optic disk region in an image closely matches the templates, we estimate the position of the optic disk region by finding the image patch with the maximum detection score. The size of an image patch is the same as defined above. For every image patch p in an image, the detection score is calculated by Eq. (9):

$$ S={F}^{-1}\left(F(a)\times F\left(\kappa \left(p,{S}_t\right)\right)\right) $$
(9)

where \( S_t \) is the detection score of the optic disk region in the templates, and the other parameters are as defined above. The optic disk region position in a fundus image is the location of the patch with the maximum detection score S. After determining the patch that most likely contains the OD region, the inscribed circle of that patch is taken as the OD boundary.
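For illustration, the sketch below evaluates Eq. (9) on one candidate search window using the coefficients returned by `train_coefficients` above; with the linear kernel, \( \kappa(p, \cdot) \) over all cyclic shifts becomes a circular cross-correlation with the template patch. Scanning the image and keeping the window whose response map contains the global maximum is assumed here; the paper does not spell out its search strategy in this detail.

```python
import numpy as np

def detection_response(window, template_patch, A_fft):
    """Response map of Eq. (9) for one search window (a sketch).

    window         : candidate grayscale patch, same size as the template.
    template_patch : labelled template patch used during training.
    A_fft          : Fourier transform of the coefficients a(m, n).
    """
    K = np.conj(np.fft.fft2(template_patch)) * np.fft.fft2(window)  # linear kernel over all shifts
    S = np.real(np.fft.ifft2(A_fft * K))                            # Eq. (9)
    return S

# Usage sketch: keep the window whose response map contains the largest score.
# best_score = S.max(); dy, dx = np.unravel_index(np.argmax(S), S.shape)
```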

2.2 Optic disc center localization

Figures 2, 3 and 4 show retinal fundus images in different situations. Figure 2 shows a healthy retina: the optic disc appears as a bright circular area in the center of the image, and the blood vessel network converges at the center of the OD. Figure 3 shows a retina with glaucoma, where the circular shape and size of the OD are distorted. Figure 4 shows an abnormal retina with a blurred optic disc, whose boundary is nearly invisible. These cases illustrate the difficulty of accurately detecting the OD and the need for a robust algorithm that handles its various appearances.

Fig. 2 A healthy retina; the OD is identifiable by both shape and color

Fig. 3 Optic disc photograph of a retina with glaucoma

Fig. 4 An abnormal retina; the optic disc boundary is blurred

However, Figs. 2, 3 and 4 share a common visible property: the blood vessel networks all converge at the OD center. Therefore, we base our OD center localization on computing the coordinates of the blood vessel convergence point. We divide the process into three steps: retinal vessel segmentation, determining the center abscissa, and determining the center ordinate.

2.2.1 Retinal vessel segmentation

Each color channel of an RGB fundus image carries different information; for example, the red channel mainly carries luminance, whereas the green channel carries the blood vessel information. Hence, we use the green-channel image to separate the blood vessels from the optic disk region. The three channel images are shown in Fig. 5.

Fig. 5 The fundus image in different channels. The left figure shows the R channel, the middle figure the G channel, and the right figure the B channel

Since the optic disk region has already been detected, we only need to search for the optic disk center within that region rather than in the whole image, which saves detection time. Median filtering and edge enhancement are first performed on the optic disk region, and the blood vessels are then separated from the image by binarization with a grayscale threshold, which is set to 150 in our method.
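A minimal sketch of this preprocessing step is given below, assuming OpenCV. The paper does not name a specific edge-enhancement operator, so unsharp masking is used here as one plausible choice, and vessels are assumed to be darker than the OD background (hence the inverted threshold); both choices, and the function name, are assumptions.

```python
import cv2
import numpy as np

def segment_vessels(od_region_bgr, threshold=150):
    """Vessel segmentation inside the detected OD region (a sketch)."""
    green = od_region_bgr[:, :, 1]                   # OpenCV stores images as BGR
    smoothed = cv2.medianBlur(green, 5)              # median filtering
    blurred = cv2.GaussianBlur(smoothed, (0, 0), 3)
    enhanced = cv2.addWeighted(smoothed, 1.5, blurred, -0.5, 0)   # unsharp-mask edge enhancement
    _, vessels = cv2.threshold(enhanced, threshold, 255, cv2.THRESH_BINARY_INV)
    return vessels                                   # binary vessel mask (255 = vessel)
```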

2.2.2 Determining abscissa of optic disk center

Noting that vessel-connected components are concentrated around the optic disk center, whereas the connected components elsewhere in the retina are diffuse, we can exploit this property to detect the OD center. Moreover, since the thick retinal vessels are usually approximately vertical as they pass through the OD, we compute the abscissa of the OD center by sliding a vertical rectangular window across the optic disk region from left to right. The width of the rectangular window is twice the blood vessel width, and its height is identical to that of the optic disk region. Within each vertical window, the blood vessels are divided into vessel-connected components, and D(v), defined in Eq. (10), quantifies the degree of blood vessel concentration:

$$ D(v)=\left(-1\right)\times \left({\displaystyle \sum_{i=1}^{N_w}\left(\frac{V_i}{V}\times \log \left(\frac{V_i}{V}\right)\right)}\right) $$
(10)

where D(v) represents the concentration of blood vessels in the v-th window, \( N_w \) is the number of vessel-connected components in that window, \( V_i \) is the number of pixels in the i-th vessel-connected component, and V is the total number of blood vessel pixels in the vertical rectangular window v. When \( \frac{V_i}{V} \) is small, \( \log \left(\frac{V_i}{V}\right) \) is a large negative number, and as a result D(v) is large and positive. Hence, different vessel distributions produce clearly separated D(v) values. The center of the optic disk is considered to lie on the central vertical line of the rectangular window that minimizes the concentration value D(v), which indicates a relatively large vessel-connected component. This determines the abscissa of the center. Figure 6 shows the relation between D(v) and the rectangular window position.
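The sketch below computes D(v) for every horizontal position of the vertical window on the binary vessel mask, using connected-component labeling. The window width (twice the vessel width) and the mapping from the best window to the abscissa follow the description above, while the function and variable names are illustrative assumptions.

```python
import numpy as np
from scipy import ndimage

def vessel_concentration(vessel_mask, window_width):
    """D(v) of Eq. (10) for every vertical window position (a sketch).

    vessel_mask  : binary vessel image of the OD region (nonzero = vessel).
    window_width : window width, taken as twice the vessel width in the paper.
    """
    H, W = vessel_mask.shape
    D = np.full(W - window_width + 1, np.inf)
    for v in range(W - window_width + 1):
        win = vessel_mask[:, v:v + window_width] > 0
        labels, n_comp = ndimage.label(win)          # vessel-connected components
        total = win.sum()
        if total == 0 or n_comp == 0:
            continue
        sizes = ndimage.sum(win, labels, index=np.arange(1, n_comp + 1))
        p = sizes / total
        D[v] = -np.sum(p * np.log(p))                # Eq. (10)
    return D

# Abscissa of the OD centre: centre column of the window minimizing D(v).
# cx = int(np.argmin(D)) + window_width // 2
```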

Fig. 6 Relation diagram between D(v) and the rectangular window at different locations. The blue, red and black windows in the right figure correspond to the 1st, 4th and 7th bins in the left figure

2.2.3 Determining ordinate of optic disk center

Since the optic disk is significantly brighter than its surroundings and the Gabor filter response inside the optic disk is greater than that of the surrounding pixels, in the third stage we locate the ordinate of the OD center using luminance information and a Gabor filter. First, mean filtering is applied to suppress uneven-illumination noise and imaging artifacts. We then slide a horizontal rectangular window, whose width is identical to that of the OD region and whose height approximately equals the retinal vessel width, across the optic disk region from top to bottom. The value R(h) defined in Eq. (11) measures the combined intensity and Gabor response in the h-th horizontal rectangular window, where \( (x_i, y_i) \) (with \( 1 \le i \le N_p \), \( N_p \) being the total number of pixels in the window) are the pixel coordinates in the horizontal window h and \( l(x_i, y_i) \) represents the RGB value of the pixel at \( (x_i, y_i) \):

$$ R(h)=\frac{{\displaystyle \sum_{i=1}^{N_p}\left(l\left({x}_i,{y}_i\right)\times g\left({x}_i,{y}_i\right)\right)}}{N_p} $$
(11)

where \( g(x_i, y_i) \) is the mean value of the Gabor filter at \( (x_i, y_i) \), defined by Eq. (12) below, δ is the standard deviation of the Gaussian factor (in pixels) in the h-th window, and f is a constant coefficient, fixed at 0.2 in our method.

$$ g\left({x}_i,{y}_i\right)=\frac{f^2}{\pi \times {\delta}^2}\times {e}^{-{f}^2\times \frac{{\left({x}_i^{\prime}\right)}^2+{\left({y}_i^{\prime}\right)}^2}{\delta^2}}\times {e}^{2\pi \times f\times {x}_i^{\prime}} $$
(12)

\( x_i^{\prime} \) and \( y_i^{\prime} \) are defined by Eqs. (13) and (14), respectively, where θ is the orientation of the Gabor filter in radians. In this paper, θ is set to π/2 so as to obtain a horizontal kernel.

$$ {x}_i^{\prime }={x}_i\times \cos \theta +{y}_i\times \sin \theta $$
(13)
$$ {y}_i^{\prime }=-{x}_i\times \sin \theta +{y}_i\times \cos \theta $$
(14)

The rectangular window with the maximum R(h) value is regarded as the one containing the center, and the ordinate of the central horizontal line of that window is taken as the ordinate of the optic disk center. The center position obtained in this way is then mapped back to the original retinal fundus image. Figure 7 shows the relation between R(h) and the rectangular window position.
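A sketch of this step is given below. OpenCV's real-valued `getGaborKernel` is used as a stand-in for Eq. (12) with θ = π/2, and the kernel size and wavelength are tied loosely to the vessel width; these parameter choices, like the function names, are assumptions for illustration rather than the paper's exact settings.

```python
import cv2
import numpy as np

def window_response(od_region_gray, window_height, vessel_width):
    """R(h) of Eq. (11) for every vertical position of the horizontal window (a sketch)."""
    img = cv2.blur(od_region_gray.astype(np.float64), (5, 5))      # mean filtering
    gabor = cv2.getGaborKernel((21, 21), sigma=vessel_width,
                               theta=np.pi / 2, lambd=2.0 * vessel_width,
                               gamma=1.0, psi=0)
    g = cv2.filter2D(img, -1, gabor)                               # per-pixel Gabor response g(x, y)
    weighted = img * g                                             # l(x, y) * g(x, y) of Eq. (11)
    H = weighted.shape[0]
    R = np.array([weighted[h:h + window_height, :].mean()          # window mean = Eq. (11)
                  for h in range(H - window_height + 1)])
    return R

# Ordinate of the OD centre: centre row of the window maximizing R(h).
# cy = int(np.argmax(R)) + window_height // 2
```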

Fig. 7 Relation diagram between R(h) and the rectangular window at different locations. The blue, red and black windows in the right figure correspond to the 1st, 4th and 7th bins in the left figure

Through the steps above, we obtain a vertical window \( v_i \) that minimizes D(v) and a horizontal window \( h_j \) that maximizes R(h). The optic disk center is then determined as the intersection of the central vertical line of \( v_i \) and the central horizontal line of \( h_j \).

3 Experiments and results

3.1 Datasets

Experiments are conducted on two databases: the Digital Retinal Images for Vessel Extraction (DRIVE) [18], and the Non-fluorescein Images for Vessel Extraction (NIVE) [12] datasets.

DRIVE: This dataset contains 40 color fundus images, including 10 unhealthy retinal images and 30 healthy images. All images are captured by a Canon CR5 non-mydriatic 3CCD camera at a 45-degree field of view. Each image is 768 × 584 pixels.

NIVE: This is a private dataset. All images are provided by the Department of Ophthalmology of Shanghai Jiao Tong University Affiliated Sixth People's Hospital. The dataset contains 400 color fundus images, including 150 unhealthy retinal images and 250 healthy images. All images are captured by a Canon CR-DGi non-mydriatic retinal camera at a 45-degree field of view. Each image is 1936 × 1288 pixels.

3.2 Experiments

We evaluate our proposed method on the two datasets presented above, DRIVE and NIVE, whose manual segmentation results are available. For DRIVE, the manual results are labeled by three observers trained by an experienced ophthalmologist. For NIVE, the manual results are labeled by three experts from the Sixth People's Hospital, affiliated with Shanghai Jiao Tong University. We assume that the manually labeled images are exactly correct and use the manual detections as the ground truth in our tests. Examples of the detection process are shown in Fig. 8. All experiments are performed in MATLAB 2012b on a computer with a 3.20 GHz Intel Core i5-4460 CPU and 8 GB RAM.

Fig. 8 The detection process of our method. (a) Original retinal fundus images. (b) Optic disk region images obtained in the first step. (c) Optic disk region images with the blood vessels segmented. (d) Optic disk region images with the center located. (e) Final images with the OD center detected and the OD boundary marked

3.2.1 Optic disc region detection

We use three common evaluation metrics to assess the proposed technique: precision, recall rate, and F1 score. Precision (PR) is the proportion of correctly detected optic disk pixels among all optic disk pixels detected by our algorithm. Recall rate (RE) is the proportion of correctly detected optic disk pixels among all optic disk pixels in the ground truth. The F1 score (F1) reflects the trade-off between precision and recall. These metrics are defined by Eqs. (15), (16) and (17), respectively:

$$ PR=\frac{tp}{tp+fp} $$
(15)
$$ RE=\frac{tp}{tp+fn} $$
(16)
$$ F1=2\times \frac{PR\times RE}{PR+RE} $$
(17)

In the equations above, tp is the number of optic disk pixels identified correctly by our algorithm, fp is the number of pixels identified as optic disk pixels that are not optic disk pixels in the ground truth, and fn is the number of true optic disk pixels not identified by our algorithm. The closer these metrics are to 1, the more accurate the detection of our proposed method. A value of 0.5 means that our algorithm is equivalent to a random guess, whereas a value of 1.0 means that the detection result of our method is completely consistent with the manually labeled results.
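For completeness, a small sketch of how these pixel-wise metrics can be computed for one image is given below; the boolean-mask inputs and the function name are illustrative assumptions.

```python
import numpy as np

def region_metrics(pred_mask, gt_mask):
    """PR, RE and F1 of Eqs. (15)-(17) for one image (a sketch).

    pred_mask, gt_mask: boolean arrays of the same shape, True = optic disk pixel.
    """
    tp = np.logical_and(pred_mask, gt_mask).sum()
    fp = np.logical_and(pred_mask, ~gt_mask).sum()
    fn = np.logical_and(~pred_mask, gt_mask).sum()
    pr = tp / (tp + fp)              # Eq. (15)
    re = tp / (tp + fn)              # Eq. (16)
    f1 = 2 * pr * re / (pr + re)     # Eq. (17)
    return pr, re, f1
```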

Furthermore, we use three additional metrics to evaluate the optic disk region detection: the center location error (CLE), distance precision (DP), and overlap precision (OP). CLE measures the average Euclidean distance between the geometric center of the region detected by our method and that of the ground truth. DP is the percentage of images whose CLE is smaller than a given threshold. OP is the percentage of images whose detected region overlaps the ground-truth region by more than a given threshold.

Figure 9 shows the distance precision (DP) and overlap precision (OP) curves. From Fig. 9 we see that DP is close to 1 when the location error threshold exceeds 6, and OP is close to 1 when the overlap threshold is below 0.92. Table 1 lists the evaluation metrics and detection times of our method on the different datasets, and Fig. 10 shows a comparison between our method and manual detection. From Table 1, we see that almost all evaluation metrics exceed 94 %, and the mean detection time per image is below 0.2 s. All these results indicate the high efficiency of our method.

Fig. 9 Distance precision and overlap precision of our method. DRIVE_H contains all healthy retinal images in the DRIVE dataset, DRIVE_U all unhealthy retinal images in DRIVE, NIVE_H all healthy retinal images in NIVE, and NIVE_U all unhealthy retinal images in NIVE

Table 1 The evaluation metrics of datasets tested for the detection of the optic disk region
Fig. 10 A comparison between our method and manual detection of the optic disk region. The manual results are provided by doctors from the Sixth People's Hospital, affiliated with Shanghai Jiao Tong University. The optic disk region is marked with a circle

3.2.2 Optic disc center localization

We use the Geometric Dilution of Precision (GDOP) and the success rate (SU) to assess our center localization algorithm. GDOP measures the accuracy of detection: it reflects the distance between the actual position of the optic disk center and the position determined by our method. SU is the ratio of the number of images whose optic disk center is detected correctly to the total number of images. SU and GDOP are defined by Eqs. (18) and (19), respectively:

$$ SU=\frac{Ns}{Ns+Nf} $$
(18)
$$ GDOP=\sqrt{\sigma_{cx}^2+{\sigma}_{cy}^2} $$
(19)

where Ns is the number of images whose optic disk center is detected correctly, Nf is the number of images whose center is located incorrectly, and \( \sigma_{cx}^2 \) and \( \sigma_{cy}^2 \) are the variances along the x-axis and the y-axis, defined by Eqs. (20) and (21):

$$ {\sigma}_{cx}^2=\frac{1}{N_I}{\displaystyle \sum_{i=1}^{N_I}{\left(c{x}_i-c{x}_i^{\prime}\right)}^2} $$
(20)
$$ {\sigma}_{cy}^2=\frac{1}{N_I}{\displaystyle \sum_{i=1}^{N_I}{\left(c{y}_i-c{y}_i^{\prime}\right)}^2} $$
(21)

where \( (cx_i, cy_i) \) are the manually labeled optic disk center coordinates of image i, \( (cx_i^{\prime}, cy_i^{\prime}) \) are the optic disk center coordinates of image i detected by our method, and \( N_I \) is the total number of images tested. A lower GDOP value indicates better accuracy.
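As an illustration, the sketch below computes SU and GDOP over a set of detected and ground-truth centers. The paper does not state the exact criterion for counting a detection as a success, so a distance threshold is assumed here; the threshold and the function name are assumptions.

```python
import numpy as np

def center_metrics(pred_centers, gt_centers, success_radius):
    """SU and GDOP of Eqs. (18)-(21) over a test set (a sketch).

    pred_centers, gt_centers : (N_I, 2) arrays of (cx, cy) coordinates.
    success_radius           : assumed distance threshold for a correct detection.
    """
    pred = np.asarray(pred_centers, dtype=float)
    gt = np.asarray(gt_centers, dtype=float)
    dist = np.linalg.norm(pred - gt, axis=1)
    su = np.mean(dist < success_radius)                 # Eq. (18)
    var_x = np.mean((gt[:, 0] - pred[:, 0]) ** 2)       # Eq. (20)
    var_y = np.mean((gt[:, 1] - pred[:, 1]) ** 2)       # Eq. (21)
    gdop = np.sqrt(var_x + var_y)                       # Eq. (19)
    return su, gdop
```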

Figure 11 shows how D(v) and R(h) vary with the position of the rectangular window; curves in different colors correspond to different window sizes. The valley of the D(v) curve is most pronounced when the window width is twice the blood vessel width, and the peak of the R(h) curve is most pronounced when the window height equals the blood vessel width. Hence, we choose twice the blood vessel width as the width of the vertical rectangular window when determining the abscissa of the center, and the blood vessel width as the height of the horizontal rectangular window when determining the ordinate.

Fig. 11 The changes in D(v) and R(h) with the position of the rectangular window. Curves in different colors represent different sizes of the rectangular window: the blue curve indicates a window size of four times the blood vessel width, the red curve three times, the green curve twice, and the purple curve equal to the blood vessel width

Table 2 lists the GDOP values and detection times of our method on the different datasets. Figure 12 is a scatter plot showing the deviation in center detection, and Fig. 13 shows a comparison between our method and manual detection. From Table 2, we see that all GDOP values are below 2.5 pixels and the detection time is below 1.5 s. From Fig. 12, we see that the deviation between our detections and the ground truth is small (the maximum is 10.8 and the minimum is 0). Moreover, we invited 10 qualified doctors to evaluate the quality of the results obtained by our method; as summarized in Table 3, the evaluations of our approach surpass those of other existing methods, which demonstrates that the detection of our method is accurate, while the short detection time shows that it is efficient.

Table 2 The evaluation values for datasets tested for the detection of the optic disk center
Fig. 12 Scatter plot showing the deviation in OD center detection. The data correspond to a subset of images from the NIVE dataset

Fig. 13 A comparison between our method and manual detection of the optic disk center. Manual results are provided by doctors from the Sixth People's Hospital, affiliated with Shanghai Jiao Tong University. The optic disk center is marked with a green spot (our method) and a black spot (manual detection)

Table 3 The evaluation results by assessors for the detection of the optic disk

3.3 Assessor evaluation

We invited 10 hospital doctors to evaluate the quality of the results obtained by our method; the reported values are the percentage of images whose detection results were accepted by the doctors. Table 3 gives the quantitative values of the doctors' evaluation, and Fig. 14 analyzes the evaluation values for the different datasets. The mean accuracy is greater than 97 %, which confirms that our method is well accepted by the doctors.

Fig. 14 Assessors' evaluation of our method. The black lines in the boxes denote the median, the triangles represent the mean values, and the black lines outside the boxes represent the maximum and minimum values

4 Discussion

To demonstrate the effectiveness of our method, we choose several recently proposed supervised methods, test them on the NIVE dataset, and assess their performance with the evaluation metrics described above. Table 4 shows the resulting metrics. Our method achieves a higher SU and a smaller GDOP value than the other methods, which shows that it is faster and more precise than the prevalent techniques. Moreover, we invited qualified experts to evaluate the quality of the experimental results; as summarized in Table 4, the results obtained by our method are more widely accepted by the assessors. Figure 15 shows the assessors' evaluation results for the different methods; the correct rate of our method is higher than that of the other methods. Based on the data in Table 4 and Fig. 15, we conclude that the detection results of our method are more precise than those obtained by other methods.

Table 4 The comparison between our method and other methods tested on the NIVE dataset
Fig. 15 A comparison of assessor evaluations between our method and other methods. The black lines in the boxes denote the median, the triangles represent the mean values, and the black lines outside the boxes represent the maximum and minimum values

5 Conclusions

This paper presents an effective and efficient approach to accurately detect the optic disc in color retinal images. The proposed technique first employs a kernelized least-squares classifier to determine the OD region and then locates the OD center by evaluating both vessel-connected components and luminance information. Fundus images from two datasets, DRIVE and NIVE, are used to test the robustness of the proposed method. The results show that the proposed method outperforms other optic disk detection methods with a competitive accuracy (97.52 %) and efficiency (0.1761 + 0.9816 = 1.1577 s per image). Moreover, the proposed method can not only be used for OD detection but may also be applied to other image segmentation problems, especially in the medical image processing field. In the future, we intend to explore the implicit relations between the OD and other anatomical structures, such as the macula and the retinal vessels.