Introduction

Accurate and early diagnosis of diseases has always been a major public health concern (SBD 2018). The benefits are diverse, including lower treatment costs and early detection of symptoms of potentially irreversible damage. Medical examinations evolve continuously alongside advances in image acquisition, processing, and analysis techniques. In computerized exams, image processing is a key factor, as it emphasizes structures of interest for the identification of potential anomalies (Singh and Kaur 2015).

Currently, a great deal of research has focused on the digital processing of retinal images, aiming to identify and analyze diseases such as diabetic retinopathy, glaucoma, macular degeneration, atherosclerosis, hypertension, and cardiovascular diseases (Singh and Kaur 2015; Zhu et al. 2017). Figure 1 shows images of a healthy retina and of a retina with signs of diabetic retinopathy, both from the Structured Analysis of the Retina (STARE) database.

Fig. 1 Fundus images. a Healthy retina (Image #82). b Presence of diabetic retinopathy (Image #139)

Usually, the blood vessels in retinal images are marked manually, which increases both the analysis time and the financial cost. Supervised and unsupervised image methods aim to segment the vessels automatically and thus facilitate the marking process; one of the main challenges is the low contrast of the vessels in relation to the background. The main works on unsupervised methods, the category to which this proposal belongs, are summarized below. With few exceptions, all of them used the Digital Retinal Images for Vessel Extraction (DRIVE) and STARE databases.

Nugroho et al. (2018) applied the 2D Gabor wavelet transform (2D-GWT) and morphological operations to fundus images to segment the retinal blood vessels. The pre-processing step consists of extracting the green channel, followed by the complement operation and contrast limited adaptive histogram equalization (CLAHE). In the segmentation step, the 2D-GWT reduces noise and enhances the vascular pattern, and a closing operation reconnects image points improperly disconnected in the previous processes. Finally, morphological reconstruction recovers the object edges through successive operations based on morphological dilation and connectivity.

Neto et al. (2017) segmented the retinal vessels with a coarse-to-fine approach, combining Gaussian smoothing, the top-hat morphological operator, and contrast enhancement for vessel homogenization and noise reduction. Based on statistics of spatial dependence and probability, the authors approximate the map of thicker vessels with a local adaptive threshold. Curvature analysis and morphological reconstruction refine the segmentation.

Fan et al. (2019) used a hierarchical strategy integrated into an image matting model for blood vessel segmentation. A trimap, created from characteristics present in the images, separates the pixels into vessel, background, and unknown regions; the hierarchical image matting model then labels the pixels of the unknown regions as vessel or background.

Sazak et al. (2019) introduced a method called the bowler-hat transform to enhance blood vessels in the retina. Based on mathematical morphology, it combines different structuring elements and minimizes non-uniform illumination effects, preserving vessel junctions and improving the detection of fine vessels.

The following works use methods classified as supervised.

Roychowdhury et al. (2015) devised a vessel segmentation method for retinas with abnormalities due to diabetic retinopathy. Initially, a high-pass filter and a morphological reconstruction operation generate two binary images from the green channel; the pixels classified as vessels in both images constitute the major (larger) vessels. Subsequently, a Gaussian mixture model (GMM) classifies the remaining pixels, which are then combined with the major vessels.

Liskowski and Krawiec (2016) proposed the segmentation of retinal vessels using deep neural networks. The main characteristics in this study are spatial arrangement, local connectivity, parameter sharing, and grouping of hidden units.

In Zhu et al. (2017), a method based on the extreme learning machine (ELM) provides the retinal vascular pattern from a training vector with 39 characteristics for each pixel. These characteristics include the pixel intensity, the result of 2D Gaussian filtering and its derivatives, morphological operations, and top-hat and bottom-hat transformations.

Considering the significance of retinal images in diagnosing various diseases, this work deals with the application of digital processing techniques for contrast enhancement, noise filtering, and automatic retinal vessel segmentation in fundus images. The work was developed in a MatLab® Version 2017a environment using training and test images obtained from the public databases DRIVE (Staal et al. 2004), STARE (Hoover et al. 2000), and HRF (Budai et al. 2013).

Methods

As shown in Fig. 2, this section presents the proposed structure for contrast enhancement and retinal blood vessel segmentation. Building on Nugroho et al. (2018), this proposal adds edge and mask detection using the red channel, retinal edge removal, and suppression of the vessel background, in order to preserve vessel junctions and reduce the noise level. As will be shown, these modifications allowed the method to perform better on the DRIVE, STARE, and HRF databases.

Fig. 2 Flowchart of the proposed method. Pre-processing stage: green and red channel extractions, edge and mask detections, complement operation, retinal edge removal, CLAHE, and suppression of the vessel background. Segmentation stage: 2D Gabor wavelet transform, closing operation, and morphological reconstruction

Databases

The tests and validation were performed on three public databases commonly used in related works: DRIVE (Digital Retinal Images for Vessel Extraction) from Staal et al. (2004), STARE (Structured Analysis of the Retina) from Hoover et al. (2000), and HRF (High-Resolution Fundus image database) from Budai et al. (2013), available at www.isi.uu.nl/Research/Databases/DRIVE/, http://cecas.clemson.edu/~ahoover/stare, and https://www5.cs.fau.de/research/data/fundus-images/, respectively.

The DRIVE database provides 40 color fundus images of 584 × 565 pixels size, captured by a Canon CR5 non-mydriatic 3CCD camera with a 45-degree field of view (FOV), in which 7 images present signs of mild early diabetic retinopathy. The 40 images are available in training and test sets, both containing 20 images (Staal et al. 2004).

The STARE database consists of 20 color fundus images of 700 × 605 pixels size, captured by a TopCon TRV-50 fundus camera with a 35-degree FOV. The diagnostics list and the expert annotations of manifestations visible in the images are also available.

The HRF database provides 45 high-resolution color fundus images of 3504 × 2336 pixels, divided into 3 sets of 15 images each: healthy patients, patients with diabetic retinopathy, and patients with glaucoma.

All databases supply images with manual medical markings, named ground truth images, which allow performing objective validations.

Pre-processing

The pre-processing techniques, indicated in Fig. 3, aim to enhance the image, suppress unwanted distortions, and highlight characteristics that are important for the segmentation process. Figure 3 shows fundus image #39 (Fig. 3a), available in the DRIVE database, and the images resulting from the pre-processing stages (Fig. 3b–f).

Fig. 3 Step-by-step of pre-processing. a Original RGB image. b Green channel extraction. c Complement operation. d Retinal edge removal. e CLAHE. f Suppression of vessels background by vessel enhancement 2D

Green channel extraction

The original images are available in the RGB color space, where the pixel intensity can vary from 0 to 255 (8 bits) for each of these colors. The green channel (Fig. 3b) is the RGB component with the highest contrast between the blood vessels and the background, which makes it the most adequate channel for the analysis of fundus images.
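As a minimal MatLab® sketch of this step (the file name is illustrative):

```matlab
% Read an RGB fundus image and keep the green channel, which offers
% the highest contrast between vessels and background.
rgb   = imread('fundus.png');   % hypothetical file name
green = rgb(:,:,2);             % G component of the RGB image
```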

Complement operation

The complement operation (Fig. 3c) consists of subtracting the gray-level values of the green channel image from 255. The result is an inversion of the image intensity levels, making the blood vessels brighter than the retinal background. The complement of an image A is:

$$ A^{c}=\left\{\left(x,y,K-z\right)\mid \left(x,y,z\right)\in A\right\} $$
(1)

where x and y are the pixel coordinates, K = 2^l − 1, and l is the number of bits used to represent the intensity z (Gonzalez and Woods 2018).
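For 8-bit images (K = 255), a one-line MatLab® equivalent of Eq. 1 is:

```matlab
% Complement of the green channel: vessels become brighter than the background.
comp = imcomplement(green);     % for uint8 inputs this equals 255 - green
```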

Edge and mask detections for retinal edge removal

A mask for the region of interest (ROI) is created from the red channel using the Sobel operator and mathematical morphology. As the ROI is not completely circular in some databases, the images must be superimposed on a black rectangle with larger dimensions than the original image to ensure that the Sobel operator detects the mask edge.

As edges are regions characterized by abrupt variations in pixel intensity, they present a high spatial gradient. The Sobel operator calculates the 2D spatial gradient of the image and emphasizes pixels with a high spatial gradient (Gupta and Mazumdar 2013). For this, a pair of 3 × 3 convolution matrices slides over the image along the x and y axes, producing the respective gradient components Gx and Gy. The component Gy is calculated by:

$$ G_y=\left[\begin{array}{ccc}1& 2& 1\\ 0& 0& 0\\ -1& -2& -1\end{array}\right]\ast I $$
(2)

where I is the image and ∗ denotes 2D convolution. Gx can be obtained analogously, using the transpose of the matrix in Eq. 2. The gradient magnitude |G| is:

$$ \left|G\right|=\sqrt{G_x^2+G_y^2} $$
(3)

Subsequently, the magnitude image is binarized and dilated with a diamond structuring element (SE) of size 2 to eliminate possible gaps in the detected edge. The mask interior is then filled with true-valued pixels (1, or white) and smoothed by erosion with a diamond SE of size 7, whose size is adjusted to reduce the white area so that only the ROI remains. The dilation and erosion of an image A by a flat SE B, denoted A ⊕ B and A ⊖ B, respectively, are (Gonzalez and Woods 2018):

$$ \mathrm{Dilation}:A\oplus B=\left\{z\mid {\left(\hat{B}\right)}_{z}\cap A\ne \varnothing \right\} $$
(4a)
$$ \mathrm{Erosion}:A\ominus B=\left\{z\mid {\left(B\right)}_{z}\subseteq A\right\} $$
(4b)

Finally, the borders are removed and the mask, which has the same dimensions as the original image, is applied to the image resulting from the complement operation, as shown in Fig. 3d.
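The sketch below reproduces this sequence in MatLab®; the SE sizes (2 and 7) follow the text, while the padding width and the use of the automatically thresholded edge function are assumptions:

```matlab
% ROI mask from the red channel: Sobel edges, dilation, filling, erosion.
red  = rgb(:,:,1);
pad  = padarray(red, [10 10], 0, 'both');    % black rectangle larger than image
bw   = edge(pad, 'sobel');                   % binarized gradient magnitude |G|
bw   = imdilate(bw, strel('diamond', 2));    % close gaps in the detected edge
mask = imfill(bw, 'holes');                  % fill the mask interior with white
mask = imerode(mask, strel('diamond', 7));   % smooth and shrink to the ROI
mask = mask(11:end-10, 11:end-10);           % remove the padding
roi  = comp;  roi(~mask) = 0;                % apply mask to complemented image
```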

CLAHE

CLAHE is widely used in medical image processing. The CLAHE algorithm divides the image into rectangular regions and applies clipping and equalization locally in each region. After a threshold is set for the gray levels, occurrences above it are clipped to minimize saturation, followed by uniform and recursive redistribution along the local histogram. As a result, the background levels become flatter, increasing the background–vessel contrast (Pizer et al. 1990; Zuiderveld 1994; Nugroho et al. 2018).

Based on the number of regions experimentally determined by Zuiderveld (1994), the images were divided into 32 rows and 32 columns of tiles. The chosen distribution for the histogram was the bell shape, also called Rayleigh, and the value for the clip limit was set to 0.02 within a 0–1 range by successive adjustments, allowing an adequate contrast enhancement with an acceptable noise level for the next steps. Figure 3e shows the enhanced image.
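These settings map directly onto the MatLab® adapthisteq function; as a sketch:

```matlab
% CLAHE with 32 x 32 tiles, Rayleigh (bell-shaped) histogram distribution,
% and clip limit 0.02 on the 0-1 range.
enh = adapthisteq(roi, 'NumTiles', [32 32], ...
                  'ClipLimit', 0.02, 'Distribution', 'rayleigh');
```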

This step offers two different and independent techniques as options to reduce the noise and the effects of non-uniform illumination and brightness, preserving vessel junctions and small vessels (a combined code sketch follows the list):

1) Opening operation: according to Gonzalez and Woods (2018), the opening operation suppresses bright details that are smaller than the structuring element (SE). Here, the non-flat SE was defined as a ball, as proposed by Nugroho et al. (2018). The radius (r) and height (h) of the SE, set through the offsetstrel MatLab® function, were determined experimentally as 2 and 140, respectively, aiming for the smallest SE and average grayscale values. Increasing h makes the image background more uniform but eliminates thinner blood vessels.

2) Vessel enhancement 2D: Sazak et al. (2019) developed a method based on Zana and Klein (2001) to enhance elongated, vessel-like structures in biomedical images. It carries out a series of morphological openings with a line-shaped SE over defined angles. According to Sazak et al. (2019), the SE width is 1 pixel and its length is the largest expected vessel diameter. For each angle, segments smaller than the SE are removed, whereas larger ones remain unchanged. The final enhanced image is the pixel-wise maximum over all the considered orientations:

$$ I_{\mathrm{out}}=\max_{\theta}\left\{I\circ b_{\theta} : \forall \theta \in \left[0,\frac{180}{n},\dots, 180-\frac{180}{n}\right]\right\} $$
(5)

where Iout is the resulting image, I is the input image, the symbol “∘” indicates the opening operation, bθ is the SE, and θ ranges over the n selected orientations. In this work, following the suggestions of Sazak et al. (2019), the SE length was set to 10 and the number of orientations n to 12.
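A combined MatLab® sketch of the two options, using the parameter values stated above (r = 2 and h = 140 for the ball SE; length 10 and n = 12 orientations for the line SEs):

```matlab
% Option 1: grayscale opening with a non-flat ball SE suppresses bright
% details smaller than the SE and evens out the background.
bg1 = imopen(enh, offsetstrel('ball', 2, 140));

% Option 2: vessel enhancement 2D (Eq. 5), pixel-wise maximum of openings
% with a 1-pixel-wide line SE rotated over n orientations.
n   = 12;  len = 10;
bg2 = zeros(size(enh), 'like', enh);
for theta = 0 : 180/n : 180 - 180/n
    bg2 = max(bg2, imopen(enh, strel('line', len, theta)));
end
```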

To handle high-resolution images such as those available in the HRF database, this proposal includes an automatic adjustment of the sizes of the SEs used in the pre-processing stage. This adjustment is based on the size ratio between the image under test and the DRIVE training images, taking the number of columns or rows of those images into account.
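A possible form of this adjustment (the exact formula is not detailed here, so this sketch is an assumption based on the 565-column width of the DRIVE images):

```matlab
% Rescale SE sizes by the column ratio between the input and DRIVE images.
scale  = size(rgb, 2) / 565;          % DRIVE images are 584 x 565 pixels
seSize = max(1, round(2 * scale));    % e.g., adapt a size-2 diamond SE
```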

Segmentation

The segmentation aims to extract the blood vessels from the retinal background. Figure 4 shows the resulting images for each segmentation step: the 2D-GWT (Fig. 4a), a closing operation (Fig. 4b), and the final image after morphological reconstruction (Fig. 4e). Figure 4f shows the corresponding ground truth image from DRIVE, while Fig. 4c and d highlight the same regions in the results of the 2D-GWT and the closing operation, respectively.

Fig. 4 Segmentation process: a 2D-GWT, b closing operation, c and d highlighted vessels after the 2D-GWT and the closing operation, respectively, e final segmented image after the morphological reconstruction, and f ground truth

2D Gabor wavelet transform

According to Jain and Farrokhnia (1990), a 2D Gabor function corresponds to a sinusoidal plane wave, with certain frequency and orientation, modulated by a two-dimensional Gaussian envelope, where the image information is extracted by measuring the energy in small windows. Blood vessel extraction by 2D-GWT (Fig. 4a) is due to its ability to provide spatial information, orientation selectivity, and spectral characteristics, which makes it possible to detect directional structures and high frequency regions, such as blood vessel edges. The 2D-GWT is defined as (Soares et al. 2006):

$$ \psi_{\mathrm{G}}(\mathbf{x})=\exp \left(j\mathbf{k}_0\cdot \mathbf{x}\right)\exp \left(-\frac{1}{2}{\left|A\mathbf{x}\right|}^2\right) $$
(6)

where A = diag[ϵ^{−1/2}, 1] is a 2 × 2 diagonal matrix, in which ϵ ≥ 1 sets the filter anisotropy, i.e., its elongation in a given direction, and k0 is a vector defining the frequency of the complex exponential.

In this work, the orientations range from 0° to 165° in steps of 15°, resulting in 12 directions in which the vessel characteristics are extracted. Following Soares et al. (2006), ϵ was set to 4 and the frequency was set to 4 for both databases.
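An illustrative MatLab® sketch of Eq. 6 under these settings; the kernel support size and normalization are assumptions, and this is not the authors' exact implementation:

```matlab
% 2D-GWT response: maximum Gabor magnitude over 12 orientations (0:15:165).
eps0 = 4;  k0 = [0 4];  half = 10;      % anisotropy, frequency vector, half-size
[X, Y] = meshgrid(-half:half, -half:half);
I    = im2double(bg2);                  % background-suppressed input image
resp = zeros(size(I));
for theta = 0:15:165
    c  = cosd(theta);  s  = sind(theta);
    xr = c*X + s*Y;    yr = -s*X + c*Y;            % rotated kernel coordinates
    env = exp(-0.5*(xr.^2/eps0 + yr.^2));          % envelope, A = diag[eps^-1/2, 1]
    psi = exp(1i*(k0(1)*xr + k0(2)*yr)) .* env;    % complex Gabor kernel (Eq. 6)
    re  = imfilter(I, real(psi), 'symmetric');
    im_ = imfilter(I, imag(psi), 'symmetric');
    resp = max(resp, hypot(re, im_));              % strongest directional response
end
```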

Closing operation

The closing operation (Fig. 4b) suppresses dark details smaller than the SE. This operation fills the central parts of larger vessels not fully detected by the 2D-GWT, owing to their distance from the vessel edges (a low-frequency region), and links up disconnected vessels, especially at junctions. The closing of A by SE B, denoted A • B, is:

$$ A\bullet B=\left(A\oplus B\right)\ominus B $$
(7)

which indicates that the closing of A by B is the dilation (⊕) of A by B followed by the erosion (⊖) of the resulting image by B (Gonzalez and Woods 2018).

The SE was defined as a flat diamond (Nugroho et al. 2018) and the distance between the origin and the edge was adjusted to 2, the minimum value available, aiming to fill the larger vessels. Values greater than 2 made the vessels wider and linked different vessels at junctions, increasing the number of pixels wrongly detected as vessels. To illustrate the filling of one vessel after the closing operation, Fig. 4c and d present pixel details resulting from the 2D-GWT and the closing operation, respectively.
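In MatLab®, this step corresponds to imclose with a diamond SE of size 2; the binarization of the 2D-GWT response shown here is an assumption:

```matlab
% Closing of the (binarized) 2D-GWT response with a flat diamond SE (Eq. 7).
gwt    = imbinarize(mat2gray(resp));        % assumed thresholding of the response
closed = imclose(gwt, strel('diamond', 2));
```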

Morphological reconstruction

The morphological reconstruction involves two images, named marker and mask, and an SE. The marker contains the starting points for the transformation, and it undergoes successive dilations until it reaches the mask. The mask limits the marker transformation, and the SE defines the connectivity. The morphological reconstruction by geodesic dilation of size n, denoted \( D_{G}^{(n)}(F) \), is defined as (Gonzalez and Woods 2018):

$$ D_{G}^{(n)}(F)=D_{G}^{(1)}\left[D_{G}^{(n-1)}(F)\right],\quad \mathrm{with}\ D_{G}^{(1)}(F)=\left(F\oplus B\right)\cap G $$
(8)

where G, F, and B are the mask, marker, and the SE, respectively.

The final image (Fig. 4e) is the result of the morphological reconstruction. The image resulting from the closing operation makes up the mask, while the 2D-GWT image functions as the marker. Figure 4f shows the respective ground truth image marked by a specialist, provided by the DRIVE database.
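A sketch of this step with imreconstruct; the 8-connectivity is an assumption:

```matlab
% Morphological reconstruction: 2D-GWT image as marker, closed image as mask.
final = imreconstruct(gwt, closed, 8);      % geodesic dilations until stability
```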

The integrated computational interface

The algorithm was implemented in the MatLab® environment, and a graphical user interface named “Retinal Lab-A Tool for Fundus Image Analysis” was developed as an application for monitoring images through the pre-processing, contrast enhancement, and segmentation steps, with parameter adjustments as well as performance indexes. This interface is available at https://github.com/DouglasAbreu/RetinalLab.

On a computer with an Intel Core i5 processor and 4 GB of RAM, the full processing of an image in the MatLab® environment took an average of 2.6 s for the DRIVE and STARE databases and 58.67 s for HRF, considering all processing stages. Figure 5 shows the developed graphical user interface, whose operation begins with the selection of an image stored in the same computational environment.

Fig. 5 Stand-alone application Retinal Lab developed in the MatLab® environment

Results

The validation included comparative analyses using objective metrics usually applied in medical image processing and content validity tests performed by ophthalmologists and retinal specialists using the “Retinal Lab-A Tool for Fundus Image Analysis.”

Objective validation

For objective evaluation, the tests used the 20 remaining images from the DRIVE and all images available in the STARE and HRF databases. After submitting the selected image to all proposed steps, the segmented image was then compared with the respective ground truth image, available in the databases.

The validation criteria, namely sensitivity, specificity, accuracy, and balanced-accuracy, are (Nugroho et al. 2018; Neto et al. 2017):

$$ \mathrm{Sensitivity}: Se=\frac{TP}{TP+FN}\times 100\% $$
(9)
$$ \mathrm{Specificity}: Sp=\frac{TN}{TN+FP}\times 100\% $$
(10)
$$ \mathrm{Accuracy}: Acc=\frac{TP+TN}{TP+FP+TN+FN}\times 100\% $$
(11)
$$ \mathrm{Balanced\ accuracy}: BAcc=\left(\alpha\, Se+\beta\, Sp\right)\times 100\% $$
(12)

TP (true positive) and TN (true negative) are the numbers of pixels correctly detected as vessels and as background, respectively, while FP (false positive) and FN (false negative) are the numbers of pixels wrongly detected as vessels and as background, respectively. In Eq. 12, the weights α and β were set to 0.5 to balance sensitivity and specificity equally (Neto et al. 2017).
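As a sketch, Eqs. 9–12 can be computed in MatLab® from logical segmentation (S) and ground truth (GT) matrices of equal size (variable names are illustrative):

```matlab
% Pixel-wise confusion counts and the four validation metrics (Eqs. 9-12).
TP = nnz( S &  GT);   FP = nnz( S & ~GT);
TN = nnz(~S & ~GT);   FN = nnz(~S &  GT);
Se   = TP / (TP + FN) * 100;                  % sensitivity (%)
Sp   = TN / (TN + FP) * 100;                  % specificity (%)
Acc  = (TP + TN) / (TP + FP + TN + FN) * 100; % accuracy (%)
BAcc = 0.5 * Se + 0.5 * Sp;                   % alpha = beta = 0.5
```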

To validate this proposal, the obtained results were compared with the main works available in the literature, separated into supervised and unsupervised classes, since they involve different methodological approaches. Table 1 shows the average results achieved on images from DRIVE, STARE, and HRF, highlighting in bold the highest rates for each category and database.

Table 1 Comparison between the proposed method and related works

Subjective validation

The subjective validation involved eleven ophthalmology professionals, including four retinal specialists. To evaluate the image quality after the 5 main test phases, as well as the overall system quality, the doctors assessed the resulting images of each processing phase using the Retinal Lab interface and images from the DRIVE and STARE databases, randomly selected by the participants. The score tests consisted of discrete grading scales between 1 and 20, encompassing 1–4 (bad), 5–8 (poor), 9–12 (reasonable), 13–16 (good), and 17–20 (excellent), subsequently converted to a 1–100 continuous range. The statistical analysis was performed in the R software environment.

Considering the 5 questions in the tests, the answers produced 55 scores, while the overall quality was inferred from the scores attributed to the steps previously evaluated. Figure 6 shows typical boxplots for the test results, after the exclusion of one participant (an outlier) who did not complete all test steps. The results show the evaluation of image quality after green channel extraction (median score 82.5), CLAHE (median score 90), suppression of the vessel background (median score 80), highlighting of vessels (median score 72.5), removal of details (median score 77.5), and overall quality (median score 75.5).

Fig. 6 Graphical results showing test responses and system overall quality, with Q1 green channel extraction, Q2 CLAHE, Q3 suppression of the vessel background, Q4 highlighting of vessels, Q5 removal of details, and Q6 overall quality

Discussion

Building on Nugroho et al. (2018), this proposal included steps for automatic mask detection and image edge removal through the red channel, using the Sobel operator and mathematical morphology. Furthermore, two additional options for suppressing the vessel background, the opening operation and vessel enhancement 2D, offer better conditions for the following steps, CLAHE and 2D-GWT, which are normally used to improve image contrast and segmentation.

Among the supervised methods, according to Table 1, Liskowski and Krawiec (2016) and Zhu et al. (2017) presented the best performances on the DRIVE database regarding sensitivity, specificity, accuracy, and balanced-accuracy. On the STARE database, Liskowski and Krawiec (2016) showed the best results for sensitivity and balanced-accuracy, while Staal et al. (2004) and Roychowdhury et al. (2015) did so for accuracy and specificity, respectively. A disadvantage of supervised methods is the need for prior training, which implies additional computation and classification time.

Considering the unsupervised methods presented in Table 1, which are the research subject of this study, the proposed method presented the best results regarding sensitivity on all databases. Regarding specificity, Sazak et al. (2019) reached the highest rates on DRIVE (jointly with Fan et al. (2019)) and on all the other databases. In relation to accuracy, Fan et al. (2019) obtained the best results on DRIVE, and Sazak et al. (2019) on STARE and HRF. With regard to balanced-accuracy, Nugroho et al. (2018), Fan et al. (2019), and Sazak et al. (2019) presented the best results on DRIVE, STARE, and HRF, respectively.

In summary, the proposed method presented the best performance in terms of sensitivity, without significant losses in the other parameters, and the second-best balanced-accuracy on all the databases. A limitation of this method is the presence of lesions with dimensions comparable to those of the largest blood vessels; such lesions are not fully removed by vessel enhancement 2D and are therefore counted as false positives.

Based on the subjective validation results, the images obtained after the green channel extraction, CLAHE, and vessel background suppression stages achieved the highest levels of acceptance (above 80%). On the other hand, the results decreased for vessel highlighting and detail removal. From the overall quality, one can infer a positive acceptance by the experts consulted.

Conclusion

The main contributions of this work are the inclusion of techniques for automatic mask detection, image edge removal, and suppression of the vessel background.

In this proposal, the sensitivity had higher priority due to its importance in the early diagnosis of diabetic retinopathy in the proliferative phase when the disease causes the emergence of new vessels in the retina, which grow towards the vitreous interface and may progress to irreversible loss of eyesight (Bosco et al. 2005).

Another contribution of this work is the computational interface “Retinal Lab-A Tool for Fundus Image Analysis”, developed in the MatLab® environment and later made available in an executable version at https://github.com/DouglasAbreu/RetinalLab, which allows users to adjust retinal blood vessel contrast enhancement and segmentation parameters.

Future works are expected to use broader databases, allowing training and testing on a wider range of images, and to incorporate other techniques into the computational platform, extending detection to other retinal structures and to pathology characterization.