1 Introduction

Funduscopic images of the retina (Abràmoff et al. 2010) are the most valuable resource for analyzing the health of the eye and related diseases. A funduscopic image can uncover the anatomy of the retina (Patton et al. 2006), such as the Optic Disk (OD), the Retinal Vessel Structure (RVS), and the Macular Region (MR) (Geetha Ramani et al. 2016). It also reveals pathologies present in the retina, such as hemorrhages, several types of exudates and aneurysms, cotton-wool spots, etc. (Fig. 1). The brightest disk-like part of the fundus image is called the Optic Disk; it is the entry point of the central retinal blood vessels. The dark central region of the retina, surrounding the fovea, is called the Macula; it is necessary for clear and sharp vision. The tree-like blood vessel structure of the retina, known as the RVS, is essential for supplying resources to the cells of the retina (Franklin et al. 2014).

Fig. 1

a Original funduscopic image. b Fundus image with pathologies

Anatomical changes in the retinal vessel structure are a good prognostic indicator for diverse retinal syndromes such as (i) Hypertensive Retinopathy (HR), (ii) Diabetic Retinopathy (DR), and (iii) different types of occlusions (e.g., Branch Retinal Vein (BRV) and Central Retinal Vein (CRV) occlusion). Retinal vasculature variation also helps detect cardiovascular disease and stroke (Geetha Ramani et al. 2016). Gross deformity in the vessel structure can be a cause of vision loss. So, early detection of changes in the RVS, including thickness, shape, curvature, branching angle between two adjacent vessels, and tortuosity, can prevent the progression of many eye diseases. Manual segmentation of the vessel structure requires an expert ophthalmologist, which can be costly and time-consuming. Moreover, the accuracy of such segmentation depends on the knowledge of the ophthalmologist, which means it may be error-prone. An automatic vessel segmentation technique can significantly help an ophthalmologist with the initial screening.

Many important research works on the segmentation of the retinal vessel structure are already available in the literature (Almotiri et al. 2018; Khan et al. 2019). The techniques can be broadly categorized into the following two classes.

I. Supervised methods: In this approach, a labeled dataset is used to train a classifier to discriminate vessel from non-vessel pixels in funduscopic images. The method has two main phases: feature extraction and classification. Various types of classifiers have been proposed in the literature, including (i) Neural Network-based, (ii) AdaBoost, (iii) Gaussian Mixture Model, (iv) Support Vector Machine, and (v) K-nearest neighbor classifiers. Niemeijer et al. (2004) propose a K-nearest neighbor-based classifier to separate vessel and non-vessel pixels in fundus images. The same technique is used by Staal et al. (2004) for RVS separation, but with a ridge-based detector for feature extraction. Ricci et al. (2007) propose a Support Vector Machine (SVM) based classifier, where the feature vectors are built from pixel intensities together with a Rotation Invariant Linear Operator (RILO). In Tang et al. (2015), an SVM-based classifier is combined with the Gabor wavelet and multiscale vessel filters for feature extraction. Aslani et al. (2016) use a multiscale and multi-rotational Gabor filter. Lupascu et al. (2010) construct an Adaptive Boosting (AdaBoost) classifier with a 41-dimensional feature set to segment the RVS. Memari et al. (2017) use a matched filter and an AdaBoost-based vessel separation method for color fundus images. Roy Chowdhury et al. (2014) propose a Gaussian Mixture Model with an 8-dimensional feature vector. Thangaraj et al. (2017) apply a neural network-based RVS segmentation technique using a 13-dimensional feature vector. Yan et al. (2019) construct a three-stage deep learning-based strategy, where the thin and thick vessels are segmented separately.

The supervised methods are time-consuming and suffer from overfitting problems. Also, the parameters used to train the classifiers depend on the size of the input images.

II. Unsupervised methods: In this approach, no labeled data is required to train the system. These methods are mainly rule-based or depend on thresholding filter responses. Some unsupervised methods are based on (i) multiscale analysis, (ii) matched filters, (iii) mathematical morphology, and (iv) Adaptive Mathematical Morphology. In Annunziata et al. (2016), a Multiscale Hessian (M.H.) based RVS segmentation method is proposed, whereas Gou et al. (2018) use a dynamic multiscale matched filter followed by a dynamic multiscale thresholding approach. An important adaptation of the multiscale line detection technique is found in Yue et al. (2018). Matched filter-based methods mainly use two distinct types of filter kernel (small and large) to segment the thin and thick vessels. Zhang et al. (2010) propose a first-order Gaussian matched filter, whereas Chaudhuri et al. (1989) use a 2-dimensional Gaussian matched filter. A combined filter-based approach is found in Oliveira (2016). Azzopardi et al. (2013) propose an algorithm called COSFIRE (Combination of Shifted Filter Responses) to extract the RVS. Many mathematical morphology (MM) based vessel segmentation methods are found in the literature. Zana et al. (2001) propose evaluating the retinal vessels and their curvature using MM. A fuzzy morphology (black top-hat) based algorithm is found in Bibiloni et al. (2019). Recently, many Adaptive Mathematical Morphology (AMM) based methods have been proposed, most of which (Nandy et al. 2020, 2021) use an Adaptive Structuring Element (ASE) to enhance the RVS. An adaptive noise detection and elimination method is presented by Mondal et al. (2017). However, most unsupervised methods are relatively slow and prone to noisy output. The following limitations are primarily found in the recently proposed methods:

i. Most of the present methods are unable to accurately segment tiny vessels due to their poor contrast with the background.

ii. The curvature of the vessels is detected inappropriately because of the predefined shape and size of the filter kernels.

iii. Some noise is sensed as part of the vessel structure due to improper preprocessing and thresholding techniques.

To address the abovementioned limitations, this paper proposes an unsupervised Adaptive Mathematical Morphology-based segmentation technique named Bel-Hat Transformation (BHT). A brief description of the procedure is as follows. Initially, the vessels of an RGB image are contrast-enhanced by a Local Laplacian Filter (Paris et al. 2015), followed by grayscale conversion that gives maximum weight to the green channel, since the contrast of the blood vessels is highest in this channel. Next, a 'Difference of Gaussians' (DoG) filter is applied to increase the contrast between tiny noisy objects and the edges of the vessels. Following this, the image is Opened by a fixed-size Line Structuring Element (LSE) rotated from 0° to 180°, and the maximum response across all resulting images is recorded at each pixel. From this image, a version smoothed with a 2-D Gaussian Structuring Element (2DGSE) is subtracted. The procedure is repeated with increasing sizes of both structuring elements. Finally, the maximum response is again recorded over all the images produced in the previous steps. The line structuring elements enhance the thick and thin vessels in their respective directions but introduce noise and isolated artifacts. The noise present in the binary vessel structure is eliminated using a novel robust statistical threshold based on the frequency of the sizes (pixel areas) of isolated objects. The output of this method is a clear and accurate retinal vessel structure.

The rest of the paper includes the following sections: Sect. 2 elaborates on the details of the proposed methodology, while Sect. 3 demonstrates the experimental results and discussion. Finally, Sect. 4 concludes the paper.

2 Proposed method

This section proposes a unique method for fundus vessel structure segmentation. The method consists of three main stages: preprocessing, vessel structure segmentation, and noise elimination. This paper introduces a novel unsupervised Adaptive Mathematical Morphology-based technique named Bel-Hat Transformation (BHT) for segmenting the blood vessels and a robust threshold based on the statistical distribution of isolated objects for removing noise and other artifacts. The steps of the algorithm are detailed in the following subsections.

2.1 Preprocessing

The main aim of this phase of the algorithm is to enhance the input RGB funduscopic image (fRGB) and increase the contrast between the noise and the vessel structure. Due to unpredictable contrast variations in different parts of the input images, contrast enhancement becomes essential before the vessel segmentation. A Local Laplacian Filter (LLF) (Paris et al. 2015) is employed to enhance the vessel structure of the RGB retinal image. The LLF is an edge-aware operator that can enhance the input image using the Laplacian Pyramid (Burt et al. 1983) without introducing halos (Li et al. 2005) and artifacts (Fattal 2009). Equation 1 symbolically depicts this operation.

$${f}_{RGB}^{enh}=LLF\{{f}_{RGB}\}$$
(1)

The result (\({f}_{RGB}^{enh}\)) is transformed to grayscale by keeping 90% of the green channel (G), as the contrast between vessel and non-vessel structures is maximal in this channel. Only 10% of the red channel (R) is included, to retain its meager information content, while the blue channel (B) is completely ignored due to its noisy appearance. Equation 2 shows this conversion.

$${f}_{gray}=0.10\times {f}_{RGB}^{enh}\left(R\right)+0.90\times {f}_{RGB}^{enh}(G)+0\times {f}_{RGB}^{enh}(B)$$
(2)
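For illustration, a minimal sketch of this channel weighting in Python (NumPy), assuming \({f}_{RGB}^{enh}\) is stored as a float array with channels in R, G, B order:

```python
import numpy as np

def to_gray(f_rgb_enh):
    """Weighted grayscale conversion of Eq. 2: 10% red, 90% green, blue ignored."""
    r, g, b = f_rgb_enh[..., 0], f_rgb_enh[..., 1], f_rgb_enh[..., 2]
    return 0.10 * r + 0.90 * g + 0.0 * b
```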

The enhanced grayscale retinal image (\({f}_{gray}\)) contains considerable noise, some of it attached to the vessel structure. Noise adjacent to the vessels is separated by contrast stretching with a DoG (Difference of Gaussians) filter having standard deviations \({\partial }_{1}\) and \({\partial }_{2}\) (where \({\partial }_{1}>{\partial }_{2}\)), as shown in Eq. 3.

$${G}_{{\partial }_{1},{\partial }_{2}}\left(s,\mathrm{t}\right)={f}_{gray}*\left(\frac{1}{2\pi {\partial }_{1}^{2}} {e}^{-\frac{({s}^{2}+ {t}^{2})}{2{\partial }_{1}^{2}}}-\frac{1}{2\pi {\partial }_{2}^{2}} {e}^{-\frac{({s}^{2}+ {t}^{2})}{2{\partial }_{2}^{2}}}\right)$$
(3)
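Since convolving with a difference of two Gaussian kernels equals the difference of two Gaussian-filtered images, Eq. 3 can be sketched as follows (the concrete values of \({\partial }_{1}\) and \({\partial }_{2}\) are not specified here and must be chosen by the user):

```python
from scipy.ndimage import gaussian_filter

def difference_of_gaussians(f_gray, sigma1, sigma2):
    """DoG of Eq. 3, with sigma1 > sigma2 as in the text."""
    assert sigma1 > sigma2
    return gaussian_filter(f_gray, sigma1) - gaussian_filter(f_gray, sigma2)
```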

Some research works (Nandy et al. 2020, 2021; Mondal et al. 2017) mention that retinal vessel thickness varies from three to seven pixels, so objects smaller than 3 pixels can be considered noise. Based on this observation, a morphological Top-Hat transformation with two disk structuring elements (Bd) is used to enhance the vessel structure. The first structuring element (Bd3), with a diameter of 3 pixels, is employed for Morphological Opening to eliminate unwanted small objects, and the second structuring element (Bd8), with a diameter of 8 pixels, is used for Morphological Closing to join discontinued vessels, as shown in Eq. 4. As a result, the output image f is free of unwanted objects smaller than 3 pixels, and small discontinuities in the vessels are removed.

$$f\left(s,t\right)=\left({\varphi }_{{B}_{d8}}\left({\gamma }_{{B}_{d3}}\left({G}_{{\partial }_{1},{\partial }_{2}}\left(s,\mathrm{t}\right)\right)\right)-{G}_{{\partial }_{1},{\partial }_{2}}\left(s,\mathrm{t}\right)\right)$$
(4)

where \({\gamma }_{{B}_{d3}}\) and \({\varphi }_{{B}_{d8}}\) represent the Morphological Opening and Closing operations, respectively.
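A possible realization of Eq. 4 with scikit-image is sketched below; note that disk() takes a radius, so radius 1 (3-pixel diameter) approximates Bd3 and radius 4 (9-pixel diameter, the nearest odd size to 8) approximates Bd8:

```python
from skimage.morphology import disk, opening, closing

def tophat_preprocess(g_dog):
    """Eq. 4: Close(Open(G, Bd3), Bd8) - G, on a float image."""
    opened = opening(g_dog, disk(1))   # remove objects thinner than ~3 px
    closed = closing(opened, disk(4))  # bridge small vessel discontinuities
    return closed - g_dog
```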

2.2 Vessel structure separation from the background

This section describes the proposed Bel-Hat Transformation (BHT) filter designed to segment the Retinal Vessel Structure (RVS) from its background. Figure 2 diagrammatically summarizes this segmentation technique. The BHT filter simultaneously uses two groups of operators (structuring elements) on the image. The first group contains Neighbor Adaptive Line Structuring Elements (NALSE), symbolically represented by \({B}_{\theta }^{i}\) with two parameters, viz. integer length \(i\in \{3,\dots , 7\}\) and orientation \(\theta \in \left[{0}^{^\circ }, {180}^{^\circ }\right]\). The \({B}_{\theta }^{i}\) is a line structuring element with all pixel values ‘1’ and the origin at the center (Fig. 3a). The second group contains 2-D Gaussian Structuring Elements (\({G}_{\sigma }\)) with variance \(\sigma \in [0.5, 1.5]\) as a parameter (Fig. 3c). The size of each Structuring Element (SE) varies from 3 to 7 pixels because the thickness of the vessels lies in this range (Nandy et al. 2020, 2021; Mondal et al. 2017). For each length \(i\in \{3,\dots , 7\}\), the NALSE is rotated in increments of \({10}^{\circ}\) over the range \(\theta \in \left[{0}{^\circ }, {180}{^\circ }\right]\), and the image is (morphologically) Opened for each of these angles. The maximum response is registered at each pixel over the stack of 18 (180/10) images. This method preserves vessels longer than ‘\(i\)’ pixels and eliminates vessels shorter than ‘\(i\)’ pixels along the direction ‘\(\theta\)’. The same procedure is repeated on the output generated in the previous step, with the NALSE length increased to (\(i+1\)). A stack of five output images \({\left[{f}_{line}^{i}\right]}_{ i\in \{3, \dots ,7\}}\) is formed, as shown in Eq. 5.

Fig. 2

Summary of the proposed segmentation technique

$${\left[{f}_{line}^{i}\right]}_{ i\in \{3, \dots ,7\}}=\left[\max_{\theta \in \left[{0}{^\circ }, {180}{^\circ }\right]}\left\{f\circ {B}_{\theta }^{i} \mid \forall \theta,\ \theta =\theta +10^\circ \right\} \,\middle|\, \forall i\in \{3, \dots ,7\}\right]$$
(5)
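A sketch of Eq. 5 is given below; the line_se helper, which rasterizes a flat line structuring element of a given length and orientation, is a hypothetical construction (scikit-image provides no rotated-line footprint out of the box):

```python
import numpy as np
from skimage.morphology import opening

def line_se(length, theta_deg):
    """Rasterize a flat line SE of the given length and orientation (hypothetical helper)."""
    theta = np.deg2rad(theta_deg)
    t = np.linspace(-(length - 1) / 2.0, (length - 1) / 2.0, length)
    xs = np.round(t * np.cos(theta)).astype(int)
    ys = np.round(t * np.sin(theta)).astype(int)
    size = 2 * int(np.ceil((length - 1) / 2.0)) + 1
    se = np.zeros((size, size), dtype=bool)
    se[ys + size // 2, xs + size // 2] = True
    return se

def max_line_opening(f, length):
    """Eq. 5: pixel-wise maximum of Openings with the line SE rotated 0..170 degrees in 10-degree steps."""
    return np.maximum.reduce(
        [opening(f, line_se(length, th)) for th in range(0, 180, 10)]
    )
```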

Similarly, a group of 2-D Gaussian Structuring Elements (2DGSE, \({G}_{\sigma }\)) with variance \(\sigma\) ranging from 0.5 to 1.5 is used to Morphologically Open the retinal image (\(f\)) iteratively. To keep the size of the 2DGSE (\({G}_{\sigma }\)) restricted to 3 to 7 pixels (because the thickness of the fundus vessels lies in this range), the variance \(\sigma\) is restricted to the interval 0.5 to 1.5 with an increment of 0.25, as shown in Eq. 6. The sequential Openings using the 2DGSE (\({G}_{\sigma }\)) with varying variance \(\sigma \in [0.5, 1.5]\) produce a stack of five output images \({\left[{f}_{G}^{\sigma }\right]}_{ \sigma \in [0.5, 1.5]}\). The 2DGSE (\({G}_{\sigma }\)) eliminates objects smaller than \({G}_{\sigma }\) while preserving those wider than \({G}_{\sigma }\). Equation 7 shows the procedure.

$$\mathrm{size}\left({G}_{\sigma }\right)=2\lceil 2\sigma \rceil+1$$
(6)
$${\left[{f}_{G}^{\sigma }\right]}_{ \sigma \in [0.5, 1.5]}=\left\{f\circ {G}_{\sigma } \mid \forall \sigma \in [0.5, 1.5]\right\}$$
(7)
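A sketch of Eqs. 6 and 7, using a non-flat grayscale Opening from SciPy with a Gaussian structuring element whose side length follows Eq. 6 (so σ = 0.5 gives a 3-pixel SE and σ = 1.5 a 7-pixel SE):

```python
import numpy as np
from scipy.ndimage import grey_opening

def gaussian_se(sigma):
    """2-D Gaussian SE with side length 2*ceil(2*sigma) + 1 (Eq. 6)."""
    k = 2 * int(np.ceil(2 * sigma)) + 1
    ax = np.arange(k) - k // 2
    xx, yy = np.meshgrid(ax, ax)
    return np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))

def gaussian_opening(f, sigma):
    """Eq. 7: non-flat grayscale Opening of f with the Gaussian SE G_sigma."""
    return grey_opening(f, structure=gaussian_se(sigma))
```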

Next, pixel-wise differences are computed between pairs of images, one from the stack \({\left[{f}_{line}^{i}\right]}_{ i\in \{3,\dots , 7\}}\) and the other from the stack \({\left[{f}_{G}^{\sigma }\right]}_{ \sigma \in [0.5, 1.5]}\), with corresponding sizes for the two structuring elements. Taking these differences forms a stack of output images (\({f}_{diff}^{i, \sigma })\), as shown in Eq. 8.

$${f}_{diff}^{i, \sigma }=\left\{\left|{\left[{f}_{line}^{i}\right]}_{ i\in \{3,\dots , 7\}}-{\left[{f}_{G}^{\sigma }\right]}_{ \sigma \in [0.5, 1.5]}\right|:\forall i\in \{3,\dots ,7\} \text{ } and \text{ }\sigma \in [0.5, 1.5]\right\}$$
(8)

Lastly, the pixel-wise maximum is counted from the stack of images \(({f}_{diff}^{i, \sigma })\) and the resultant image (\({f}_{enh}\)) contains the vessel structure separated from the background, as shown in Eq. 9.

$$f_{enh} = \max_{\substack{i \in \{3,\dots,7\} \\ \sigma \in [0.5,\,1.5]}} \left[ f_{diff}^{i,\,\sigma} \right]$$
(9)

The above algorithm can be explained as follows: the 2DGSE (\({G}_{\sigma }\)) is used to Open the image \(f\), so that sharp intensity variations decrease and blob-like structures present in the image are suppressed. On the other hand, Opening with the NALSE (\({B}_{\theta }^{i}\)) does not affect blob-like structures, due to its elongated shape. So, in uniform regions or where a blob-like object is present, taking the difference of the Opened images further decreases the intensity level. But where a vessel-like elongated structure is present, the contrast between vessel and background increases, because the NALSE (\({B}_{\theta }^{i}\)) enhances the intensity of elongated structures while the 2DGSE (\({G}_{\sigma }\)) suppresses it, as explained diagrammatically in Fig. 3.
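Putting the pieces together, a minimal sketch of the full BHT enhancement (Eqs. 5–9), reusing the max_line_opening and gaussian_opening sketches above; here each Opening is applied to \(f\) directly, and the pairing of each length \(i\) with a variance \(\sigma\) follows the (i, σ) pairs reported in Sect. 3.2:

```python
import numpy as np

def bel_hat_transform(f):
    """Eqs. 5-9: for each (i, sigma) pair, take the absolute difference between the
    directional line-Opening stack and the Gaussian Opening, then the pixel-wise max."""
    pairs = zip(range(3, 8), (0.5, 0.75, 1.0, 1.25, 1.5))
    diffs = [np.abs(max_line_opening(f, i) - gaussian_opening(f, s))
             for i, s in pairs]
    return np.maximum.reduce(diffs)
```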

Now, local Otsu's threshold (Otsu 1979) is applied to the enhanced image (\({f}_{enh}\)) for binarization, and the resulting image is denoted by \({f}_{bin}\). The output binary image (\({f}_{bin}\)) still contains some residual noise, which is eliminated in the next phase.
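A local (windowed) Otsu threshold is available in scikit-image's rank filters; a sketch, where the window radius is an assumed parameter not specified in this paper:

```python
from skimage.filters.rank import otsu
from skimage.morphology import disk
from skimage.util import img_as_ubyte

def binarize_local_otsu(f_enh, radius=15):
    """Local Otsu binarization of the enhanced image (window radius is assumed)."""
    rng = f_enh.max() - f_enh.min()
    f8 = img_as_ubyte((f_enh - f_enh.min()) / (rng + 1e-12))  # rank filters need uint8
    return f8 > otsu(f8, disk(radius))
```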

2.3 Noise elimination from the binary image

The background noises are eliminated in two steps, as explained below.

2.3.1 Histogram of isolated objects

A robust threshold is estimated from the frequency distribution of the areas of the isolated objects to separate the unwanted objects from the vessels. Assume the binary image (\({f}_{bin}\)) has a size of \(M\times N\) pixels, let \({a}_{i}\) denote the area of the ith isolated object, and let \({f}_{i}\) denote the frequency of objects with the same area \({a}_{i}\). The total area (\(A\)) of the isolated objects is then given by Eq. 10.

$$A=\sum_{i=1}^{n}{{a}_{i}f}_{i}$$
(10)

Now, the frequencies of isolated objects of the same area are calculated from the binary retinal image (\({f}_{bin}\)), and the corresponding histogram is shown in Fig. 4.
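A minimal sketch of this computation, taking the isolated objects as connected components of the binary image:

```python
import numpy as np
from skimage.measure import label, regionprops

def area_histogram(f_bin):
    """Return the distinct areas a_i and their frequencies f_i (cf. Eq. 10)."""
    areas = np.array([r.area for r in regionprops(label(f_bin))])
    values, counts = np.unique(areas, return_counts=True)
    return values, counts
```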

Fig. 3

a Rotation of the Neighbor Adaptive Line Structuring Element (NALSE, \({B}_{\theta }^{7}\)) with respect to the origin (\({4}{\rm th}\) pixel). b In b(i) and b(ii), the NALSE (\({B}_{\theta }^{7}\)) does not overlap appropriately, so these parts are considered non-vessel; in b(iii), \({B}_{\theta }^{7}\) overlaps appropriately, so this part is considered vessel structure and is restored. c Adaptive 2-D Gaussian Structuring Element (2DGSE, \({G}_{\sigma }\)). d Original intensity profile of a fundus image. e Gaussian structuring element (\({G}_{\sigma }\)) pushed underneath the intensity profile. f Result of the Opening using the 2DGSE (\({G}_{\sigma }\))

Fig. 4

a A magnified region of the binary funduscopic image: (i) shows the vessels, while (ii) to (vi) are unwanted isolated objects. b Histogram of the areas of the isolated objects

2.3.2 Elimination of noises using robust statistical thresholding

The modes of the statistical distribution (histogram) of the areas of the isolated noise and of the vessels have a significant gap, as can easily be observed in Fig. 4b. Small isolated noise objects are concentrated near the origin, while large vessel structures are scanty and condensed far away. Due to this considerable gap, the mean area is shifted toward the statistical distribution of the vessels. Therefore, a thresholding algorithm that depends on the mean (of the isolated-object areas) estimates the threshold erroneously, and the median is a more robust choice of location estimator. So, we adopt an automatic thresholding algorithm based on the median of the area distribution for noise separation. The iterative thresholding algorithm is described below.

Step 1. Select the median (\(M\)) of the frequency distribution of the area of the isolated objects (Fig. 4b) as an initial threshold.

Step 2. The isolated objects of the binary image are categorized into two classes, \({C}_{1}\) and \({C}_{2}\), depending on the threshold \(M\): class \({C}_{1}\) contains all objects with areas (\({a}_{i}\)) less than or equal to the threshold value (\(M\)), and class \({C}_{2}\) contains all objects with areas (\({a}_{i}\)) greater than \(M\), as shown in Eq. 11, where \({a}_{i}\) represents the area of the ‘ith’ isolated object.

$$\left\{\begin{array}{c}{i }^{th}\, object \, \epsilon \, {C}_{1} \quad if \quad {a}_{i}\le M\\ {i }^{th} \, object \, \epsilon \, {C}_{2} \quad if \quad {a}_{i}>M\end{array}\right.$$
(11)

Step 3. Compute the Medians \({M}_{1}\) and \({M}_{2}\) for each class \({C}_{1}\) and \({C}_{2}\), respectively.

Step 4. Compute the new threshold by taking the average of the medians \({M}_{1}\) and \({M}_{2}\), as shown in Eq. 12 below.

$$M = 1/2\left( {M_{1} + M_{2} } \right)$$
(12)

Step 5. Repeat steps 2 to step 4 until the absolute difference (\(\Delta M\)) between two consecutive thresholds becomes less than a small quantity \(\epsilon\), as in Eq. 13.

$$\left|\Delta M\right|\le \epsilon$$
(13)
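A sketch of Steps 1–5, with the convergence tolerance ε and the rule of keeping only objects whose area exceeds the converged threshold taken as assumptions consistent with the text:

```python
import numpy as np
from skimage.measure import label, regionprops

def denoise_by_area(f_bin, eps=0.5):
    """Iterative median-based area thresholding (Steps 1-5, Eqs. 11-13)."""
    lbl = label(f_bin)
    regions = regionprops(lbl)
    areas = np.array([r.area for r in regions])
    M = np.median(areas)                                  # Step 1
    while True:
        c1, c2 = areas[areas <= M], areas[areas > M]      # Step 2 (Eq. 11)
        if len(c1) == 0 or len(c2) == 0:                  # degenerate split: stop
            break
        new_M = 0.5 * (np.median(c1) + np.median(c2))     # Steps 3-4 (Eq. 12)
        converged = abs(new_M - M) <= eps                 # Step 5 (Eq. 13)
        M = new_M
        if converged:
            break
    keep = [r.label for r in regions if r.area > M]       # drop small (noise) objects
    return np.isin(lbl, keep)
```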

This algorithm is effective when the modes of a mixture distribution are far from each other. The number of iterations depends on \(\Delta M\) and the initial value of the threshold (\(M\)), which is the median of the area distribution of the binary retinal image. The noise is successfully separated from the vessels using the above median-based thresholding algorithm.

3 Experimental results and discussion

3.1 Data set

The performance of the proposed methodology is evaluated qualitatively and quantitatively using some freely available Funduscopic datasets (Staal et al. 2004; Hoover et al. 2000; Owen et al. 2009; Farnell et al. 2008; Odstrcilik 2013), as summarized in Table 1.

Table 1 Overview of some freely available Funduscopic databases used in this paper

3.2 Results

Figure 5 demonstrates the output of the different phases of our algorithm, applied for vessel detection on an image from the DRIVE database. Figure 5a exhibits an original color funduscopic image (fRGB). Figure 5b shows the result (\({f}_{RGB}^{enh}\)) of applying the Local Laplacian Filter to the original image (fRGB). The enhanced RGB image (\({f}_{RGB}^{enh}\)) is converted into a grayscale image (\({f}_{gray}\)) by the method described in Eq. 2, and the result is shown in Fig. 5c. Figure 5d shows the outcome \({(G}_{{\partial }_{1},{\partial }_{2}})\) of applying the DoG to \({f}_{gray}\), which increases the contrast between the vessels and the adjacent noise. The morphological Top-Hat transformation of Eq. 4, built from Opening and Closing with two disk-shaped structuring elements (Bd) of diameters 3 and 8 pixels respectively, is then applied to \({G}_{{\partial }_{1},{\partial }_{2}}\); the result is shown in Fig. 5e. This process eliminates all unwanted objects with an area of less than 3 pixels and fills small gaps between the vessels. Now, to segment the vessel structure and separate the background noise, the Bel-Hat Transformation is used, where the Neighbor Adaptive Line Structuring Element (\({B}_{\theta }^{i}\)) is applied iteratively while changing its length (\(i \in \{3,\dots , 7\}\)). The NALSE enhances the vessel structure, and the resulting output image (\({f}_{line}^{i}\)) is shown in Fig. 5f, where the NALSE is 4 pixels long (i.e., \(i\) = 4 px). Similarly, the 2DGSE (\({G}_{\sigma }\)) is applied iteratively to \(f\left(s,t\right)\) to obtain smoothed versions of the image (\({f}_{G}^{\sigma }\)) at different scales according to the variance (\(\sigma \in [0.5, 1.5]\)); the result for variance \(\sigma =0.75\) is shown in Fig. 5g. Figure 5h–n shows the resultant enhanced vessel structures (\({f}_{enh}\)) after taking the pixel-wise maximum from the stack (\({f}_{diff}^{i, \sigma }\)) of differences between the images \({f}_{line}^{i}\) and \({f}_{G}^{\sigma }\), where the length (\(i\)) of \({B}_{\theta }^{i}\) and the corresponding variance (\(\sigma\)) of \({G}_{\sigma }\) are (3 px, 0.5), (4 px, 0.75), (5 px, 1.0), (6 px, 1.25), (7 px, 1.5), (8 px, 1.75), and (9 px, 2.0), respectively. The average Structural Similarity (SSIM) Index and the corresponding Accuracy (Acc) of the output images, plotted in Figs. 6 and 7, clearly show that both measures reach their maximum when the length (\(i\)) of \({B}_{\theta }^{i}\) is kept at 7 pixels and the variance of \({G}_{\sigma }\) is kept at 1.5. So, these values are considered optimal for the proposed methodology, as shown in Fig. 5l. The image is binarized by applying local Otsu's threshold (Otsu 1979), and the result \({f}_{bin}\) is shown in Fig. 5o.

Fig. 5

Different vessel detection phases verified on the DRIVE dataset. a RGB funduscopic image (fRGB). b Enhanced (using the Local Laplacian Filter) RGB image (\({f}_{RGB}^{enh}\)). c Converted grayscale image (\({f}_{gray}\)). d Resultant image \({G}_{{\partial }_{1},{\partial }_{2}}\) after applying the DoG to \({f}_{gray}\). e Resultant image \(f\left(s,t\right)\) after applying the morphological Top-Hat transformation to \({G}_{{\partial }_{1},{\partial }_{2}}\). f Output image \({f}_{line}^{i}\) after applying the NALSE (\({B}_{\theta }^{i}\)) with \(i\) = 4 pixels to \(f\left(s,t\right)\). g Output image \({f}_{G}^{\sigma }\) after applying the 2DGSE (\({G}_{\sigma }\)) with \(\sigma\) = 0.75 to \(f\left(s,t\right)\). h–n Resultant enhanced vessel structures (\({f}_{enh}\)) after taking the maximum of the differences between \({f}_{line}^{i}\) and \({f}_{G}^{\sigma }\), where the values of “\(i\)” and “\(\sigma\)” are (3 px, 0.5), (4 px, 0.75), (5 px, 1.0), (6 px, 1.25), (7 px, 1.5), (8 px, 1.75), and (9 px, 2.0), respectively. o Enhanced binary image \({f}_{bin}\) after applying local Otsu's threshold. p Final binary vessel structure after eliminating the residual noise using Robust Statistical Thresholding

Fig. 6

Variation of the SSIM vs. the size of the Structuring Element (SE) tested on the HRF, DRIVE, CHASE_DB1, and STARE datasets

Fig. 7

Variation of the Accuracy vs. the size of the Structuring Element (SE) tested on the HRF, DRIVE, CHASE_DB1, and STARE datasets

Fig. 8

Input funduscopic images and the corresponding output images verified on different databases. a–c RGB funduscopic images (fRGB) randomly taken from the CHASE_DB1, STARE, and HRF databases, respectively. d–f Output images after applying the proposed algorithm to (a), (b), and (c), respectively

Lastly, Fig. 5p shows the clear binary vessel structure after eliminating the residual noise using the Robust Statistical Threshold. Figure 8 shows the final results after applying the proposed algorithm to some images found in the CHASE_DB1, STARE, and HRF databases.

3.3 Evaluation metrics

3.3.1 Evaluation metric based on the structural similarity (SSIM) index

The SSIM index quantifies the degradation between the output image produced by the proposed algorithm and the corresponding ground truth image. The SSIM index is applicable where the image pixels are strongly correlated and spatially close, and it has many applications in image processing and the media industry. To calculate the SSIM index, a fixed-size (\(N\times N\)) window is chosen around each pixel of the two images, and various local statistical measures are computed. Equation (14) shows the formula for the SSIM index of two corresponding images \(u\) and \(v\), where \({\mu }_{u}\) and \({\mu }_{v}\) are the local means of \(u\) and \(v\) at each pixel, \({\sigma }_{u}^{2}\) and \({\sigma }_{v}^{2}\) are the local variances of \(u\) and \(v\), and \({\sigma }_{uv}\) is the local covariance between \(u\) and \(v\). Finally, two constants \({k}_{1}\) and \({k}_{2}\) stabilize the division.

$$SSIM\left(u,v\right)=\frac{(2{\mu }_{u}{\mu }_{v}+{k}_{1})(2{\sigma }_{uv}+{k}_{2})}{({{\mu }_{u}}^{2}+{{\mu }_{v}}^{2}+{k}_{1})({{\sigma }_{u}}^{2}+{{\sigma }_{v}}^{2}+{k}_{2})}$$
(14)
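In practice, the SSIM index of Eq. 14 can be computed with scikit-image; the window size and data range below are assumed settings, not values from this paper:

```python
from skimage.metrics import structural_similarity

def ssim_index(u, v):
    """SSIM between ground truth u and output v (both scaled to [0, 1])."""
    return structural_similarity(u.astype(float), v.astype(float),
                                 win_size=7, data_range=1.0)
```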

3.3.2 Evaluation metrics based on the confusion matrix

The proposed algorithm's outputs (segmented binary vessel structures) are classified pixel-wise into two groups: each pixel of the output image is either part of the vessels or part of the background (non-vessels). The classification contains two correctly classified groups and two misclassified groups, so every predicted pixel falls into one of the following four categories. (i) TP (true positive): pixels correctly classified as part of the vessels in both the segmented image and the ground truth. (ii) TN (true negative): pixels correctly identified as part of the non-vessel structure in both the segmented fundus image and the corresponding ground truth image. (iii) FP (false positive): pixels recognized as vessels by the segmentation technique that are not part of the vessel structure in the corresponding ground truth. (iv) FN (false negative): pixels marked by the segmentation technique as not part of the vessels that in reality belong to the vessels. All four categories are shown in Table 2.

Table 2 Confusion matrix used for the estimation of the performance of the proposed algorithm
Table 3 Average performance calculation of the proposed method evaluated on the different Databases (Spe: Specificity, Sen: Sensitivity, Acc: Accuracy, F1: F1-Score, IOU: Intersection Over Union)

This paper uses the five metrics based on the confusion matrix that are most commonly applied when comparing the proposed algorithm with other state-of-the-art techniques. The metrics are as follows:

(i) Accuracy (Acc)

(ii) Sensitivity (Sen)

(iii) Specificity (Spe)

(iv) F1-Score (F1)

(v) Intersection Over Union (IOU)

Equation (15) measures the True Negative Rate (TNR) or Specificity, which quantifies how well the proposed technique predicts the negative class, i.e., the pixels of the output images that belong to the background.

$$Specificity \left(Spe\right)=\frac{TN}{TN+FP}$$
(15)

Equation (16) measures the True Positive Rate (TPR) or Sensitivity, which quantifies how well the proposed technique predicts the positive class, i.e., the pixels of the output images that belong to the vessels. Equation (17) shows the correctness (Accuracy) of the proposed method.

$$Sensitivity \left(Sen\right)=\frac{TP}{TP+FN}$$
(16)
$$Accuracy \left(Acc\right)=\frac{TP+TN}{TP+TN+FP+FN}$$
(17)

Equation (18) shows the F1 Score of the proposed method, which is the Harmonic Mean (HM) of the Precision (\(=TP/(TP+FP)\)) and the Sensitivity. Since the F1 Score combines Precision and Sensitivity, it can be used to compare the performance of the different methods proposed in the literature. Equation (19) shows the Intersection Over Union (IOU) measure, the ratio between the intersection and the union of the predicted vessel structure and the given ground truth. Figure 9 shows the performance evaluation (Sensitivity, Specificity, Accuracy, F1-Score, and Intersection Over Union) on the different databases (Table 3).

Fig. 9

Performance evaluation (Sensitivity, Specificity, Accuracy, and F1-Score) on different datasets: a DRIVE, b STARE, c CHASE_DB1, d HRF

$$F1 \, Score\left(F1\right)= \frac{2\times \mathrm{Precision}\times \mathrm{Sensitivity}}{\mathrm{Precision}+\mathrm{Sensitivity}}=\frac{2\times TP}{2\times TP+FP+FN}$$
(18)
$$Intersection \,Over \, Union \, (IOU)=\frac{TP}{TP+FP+FN}$$
(19)
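All five metrics (Eqs. 15–19) can be computed directly from a pair of boolean masks; a minimal sketch:

```python
import numpy as np

def segmentation_metrics(pred, gt):
    """Spe, Sen, Acc, F1, and IOU from the pixel-wise confusion matrix."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.sum(pred & gt)
    tn = np.sum(~pred & ~gt)
    fp = np.sum(pred & ~gt)
    fn = np.sum(~pred & gt)
    return {
        "Spe": tn / (tn + fp),                   # Eq. 15
        "Sen": tp / (tp + fn),                   # Eq. 16
        "Acc": (tp + tn) / (tp + tn + fp + fn),  # Eq. 17
        "F1": 2 * tp / (2 * tp + fp + fn),       # Eq. 18
        "IOU": tp / (tp + fp + fn),              # Eq. 19
    }
```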

Tables 4, 5, 6, and 7 show the performance of the proposed methodology compared with other RVS segmentation methodologies evaluated on the DRIVE, HRF, CHASE_DB1, and STARE databases, respectively.

Table 4 The performance of the proposed methodology is compared with other RVS segmentation methodologies evaluated on the DRIVE database
Table 5 The performance of the proposed methodology is compared with other RVS segmentation methodologies evaluated on the HRF database
Table 6 The performance of the proposed methodology is compared with other RVS segmentation methodologies evaluated on the CHASE_DB1 database
Table 7 The performance of the proposed methodology is compared with other RVS segmentation methodologies evaluated on the STARE database

4 Conclusion

This study proposes an unsupervised Adaptive Mathematical Morphology-based technique that simultaneously applies two groups of structuring elements (NALSE and 2DGSE) to eliminate noise and enhance the vessel structure iteratively with minimal deformity. A robust threshold based on the frequency distribution of the areas of the isolated objects (a statistical mixture distribution model) is proposed to discriminate the residual noise from the actual vessels. The experimental results demonstrate that the algorithm segments the blood vessels from fundus images of the retina with excellent precision. The novel technique achieves superior Accuracy (DRIVE: 0.9808, STARE: 0.9789, HRF: 0.9810, CHASE_DB1: 0.9807), Specificity (0.9874, 0.9857, 0.9873, 0.9869), Sensitivity (0.9410, 0.9377, 0.9446, 0.9414), F1-score (0.9335, 0.9271, 0.9363, 0.9303), and Intersection Over Union (0.8754, 0.8643, 0.8809, 0.8697) when applied to the images available in the different databases (DRIVE, STARE, HRF, and CHASE_DB1). As the results clearly show, the proposed method can segment the fundus vessel structure accurately and efficiently without any intervention from ophthalmologists or experts.