1 Introduction

Segmentation is an important step in image processing: it divides an image into regions or categories with similar attributes, which is useful for image analysis and interpretation. Many segmentation techniques exist, such as edge detection [1], which attempts to capture the significant properties of objects in the image; fuzzy c-means (FCM) clustering [2], which classifies an image by grouping similar data points in the feature space into clusters; and multilevel thresholding [3], which segments a gray-level image into several distinct regions. Segmentation is used in many applications across many fields; for example, it can be used to identify and detect populations, objects or animals.

Fish are a class of aquatic vertebrates [4]. They differ from other animals in having gills and fins; they spend their entire lives in water, and most of them are cold-blooded. Scientists believe that there are more than 24,000 different species of fish in the world. They range in size from the largest, the whale shark at 16 m (51 ft) long, to the smallest, the stout infantfish at 8 mm (1/4 in.). Using a taxonomy ontology, animals can be classified into hierarchical categories in a scientific manner. The taxon is the basic unit of taxonomy; each taxon in the taxonomic tree has a top-to-bottom description that identifies its hierarchical information, which comprises several ranks: Kingdom, Phylum, Class, Order, Family, Genus and Species.

Unfortunately, there are few automatic fish segmentation methods; most existing methods are partially manual. In this paper, we present five fish segmentation methods: the Grabcut algorithm [5], the Otsu thresholding method [6], an edge detection technique, the mean-shift method and the region-growing algorithm. All the presented segmentation methods are evaluated on a dataset that contains about 270 fish species from natural scenes.

The remainder of this paper is organized as follows. Section 2 reviews the related work. The materials and methods used in this paper are presented in Sect. 3. Section 4 introduces the experimental results. Finally, Sect. 5 concludes the paper and suggests some directions for future studies.

2 Related Work

In this section, we briefly introduce studies related to our work. Many methods rely on segmentation using the Grabcut algorithm. Hernández et al. used Grabcut to propose a fully automatic spatio-temporal human segmentation methodology [7]; they used a HOG-based person detector, face detection and a skin color model to initialize the Grabcut seeds. Prakash et al. proposed a novel formulation for integrating Grabcut with active contours [8] to obtain automatic foreground object segmentation. Their motivation was that an active contour cannot remove holes in the interior of the object, whereas Grabcut produces poor segmentation results when the color distribution of some part of the foreground object is similar to that of the background. They therefore proposed a segmentation technique, Snakecut, based on a probabilistic framework that provides an automatic way of segmenting objects. Parkhi et al. separated the foreground (pets) from the background using Grabcut [9], with cues taken from an over-segmentation of the image (superpixels).

Chuang et al. proposed an automatic segmentation algorithm for fish sampled by a trawl-based underwater camera system [10]; they achieved 78% recall against the ground truth for successful fish segmentation in very low-contrast underwater images. Li et al. presented a method to identify fish species [11]; they used basic image processing techniques to segment fish from the background, and their dataset consisted of true-color digital camera images of four fish: chub, crucian, bream and carp. Saitoh et al. introduced a fish image recognition method that uses feature points for fish images with complicated backgrounds [12]. The four feature points are the mouth, dorsal fin, caudal fin and anal fin; each is provided manually by the user and is designed as a characteristic location to avoid incorrect input. Storbeck et al. proposed a classification system for underwater video analysis [13]; they defined a new method to recognize a large variety of underwater species by using a combination of affine-invariant texture and shape features. Hu et al. presented a novel method of classifying fish species based on color and texture features using a multi-class support vector machine (MSVM) [14].

3 Materials and Methods

3.1 Dataset Description

The dataset was collected from http://fishesofaustralia.net.au/. It contains 270 images, each representing a different species. The dataset includes fish images from 2 classes, 7 orders, 25 families and 98 genera. Fig. 1 shows the hierarchical classification of fish and Fig. 2 shows samples from the dataset.

Fig. 1. Hierarchical classification of fish.

Fig. 2. Samples from the dataset.

3.2 Segmentation Methods

In this section, five segmentation methods are introduced; each method was applied individually.

Segmentation Using Grabcut Algorithm. In this section, the Grabcut algorithm is used to segment fish from the background. Grabcut is a foreground extraction algorithm that can be used when the foreground and background color distributions are not well separated. It is based on graph cuts and works from a bounding box specified around the object to be segmented; in our case, the whole image is used as the bounding box. The algorithm estimates the color distributions of the target object and of the background using a Gaussian mixture model [15]. To reduce processing time and improve the quality of the result, the input images were resized to 256 × 256 pixels; Fig. 3 shows input images after resizing.

The Grabcut algorithm is applied to images in the RGB (Red, Green and Blue) color space. It is applied to each image four times: to the original image and to copies flipped horizontally, vertically, and both horizontally and vertically. The final mask is the intersection of the four results; Fig. 4 shows the Grabcut results. To remove unwanted shapes from the images, morphological operators were used: opening (erosion followed by dilation), which is useful for removing small white noise, and closing (dilation followed by erosion), which is useful for closing small holes inside the foreground objects or small black points on the object.
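The flip-and-intersect step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `flip_intersect` and the `segment` callback are hypothetical names, `segment` stands in for a real Grabcut call, and the sketch assumes that each mask is flipped back to the original orientation before the intersection is taken.

```python
from typing import Callable, List

Image = List[List[int]]   # any 2-D grid of pixel values
Mask = List[List[int]]    # 1 = foreground, 0 = background


def flip_h(grid):
    """Mirror a grid left-to-right."""
    return [row[::-1] for row in grid]


def flip_v(grid):
    """Mirror a grid top-to-bottom."""
    return grid[::-1]


def flip_intersect(image: Image, segment: Callable[[Image], Mask]) -> Mask:
    """Segment four flipped copies, undo each flip, and AND the masks.

    `segment` is a placeholder for the real segmenter (assumption).
    """
    # Each of these transforms is its own inverse, so applying it
    # again to the mask restores the original orientation.
    transforms = [
        lambda g: g,
        flip_h,
        flip_v,
        lambda g: flip_v(flip_h(g)),
    ]
    masks = [t(segment(t(image))) for t in transforms]
    rows, cols = len(image), len(image[0])
    return [[int(all(m[r][c] for m in masks)) for c in range(cols)]
            for r in range(rows)]
```

A pixel survives only if all four runs agree that it is foreground, so orientation-dependent artifacts of the segmenter tend to be discarded.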

Fig. 3. Input images after resizing.

Fig. 4. From left to right: Grabcut on the original image, after flipping the image vertically, after flipping horizontally, after flipping vertically and horizontally, and the intersection of the four results.

Segmentation Using Otsu Thresholding Method. The Otsu thresholding method converts a gray image to a binary image. It assumes that the image contains two classes of pixels (foreground and background) with a bi-modal histogram, and it calculates the optimum threshold value separating the foreground from the background. Before applying Otsu, images were converted to the HSV (Hue, Saturation and Value) color space. Because fish are clearly visible in the Value component in all images, this component was used as the input image for thresholding; Fig. 5 shows images in the Value component.
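As a small illustration of the color-space step, the Value component of HSV is simply the largest of the three RGB channels at each pixel. The helper name `value_channel` and the list-of-tuples image layout are assumptions made for this sketch.

```python
from typing import List, Tuple

RGBImage = List[List[Tuple[int, int, int]]]


def value_channel(rgb_image: RGBImage) -> List[List[int]]:
    """Return the HSV Value component, V = max(R, G, B), per pixel."""
    return [[max(pixel) for pixel in row] for row in rgb_image]
```

The resulting single-channel image is what would be passed on to thresholding.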

Before applying Otsu thresholding, a blur operation was applied to the input image, and the histogram of the blurred image was calculated. The next step was to normalize the calculated histogram using Eq. 1, after which the cumulative sum of the normalized histogram values was calculated. Using these values, the mean is calculated from Eq. 2 and the variance from Eq. 3; the threshold level is then calculated by multiplying the variance value by the cumulative value. Figure 6 shows the calculation of the threshold value and Fig. 7 shows images after applying Otsu thresholding.

$$\begin{aligned} \text{Normalized Histogram} = \frac{H}{\max (H)} \end{aligned}$$
(1)
$$\begin{aligned} M = \frac{(W * I)}{C} \end{aligned}$$
(2)
$$\begin{aligned} Var = I * [W - M]^2 \end{aligned}$$
(3)

Here, H denotes the calculated histogram values, W the histogram weight, I the intensity values, C the cumulative values and M the mean. After applying the Otsu thresholding method, some of the above-mentioned morphological operators were used, together with a boundary removal operator, which removes any component that touches the image boundaries, and a min-area operator, which removes small shapes from the image.
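For reference, the textbook form of Otsu's method picks the threshold that maximizes the between-class variance of the histogram. The sketch below follows that standard formulation and is not necessarily identical to the computation in Eqs. 1 to 3; the function name `otsu_threshold` is an assumption.

```python
from typing import List


def otsu_threshold(hist: List[int]) -> int:
    """Return the threshold t that maximizes between-class variance.

    Pixels with intensity <= t form one class, the rest the other.
    """
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # pixel count of the low class
    sum0 = 0    # intensity sum of the low class
    for t, h in enumerate(hist):
        w0 += h
        sum0 += t * h
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance up to a constant factor 1/total^2.
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t
```

With a clearly bi-modal histogram, any threshold between the two peaks separates them; this sketch returns the lowest such value.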

Fig. 5. Images in the Value component.

Fig. 6. The calculation of the threshold value; the red line in the histogram marks the threshold.

Fig. 7. Images after Otsu thresholding.

Segmentation Using Edge Detection. Edge detection is an image processing technique that finds object boundaries inside an image by detecting discontinuities in brightness. To improve the results, images were first converted to the HSV color space and the V component was used as the input image for edge detection. After applying edge detection, morphological operators were used: a boundary removal operator, which removes any component that touches the image boundaries, and a min-area operator, which removes small shapes from the image. To overcome the problem of incomplete boundary paths, we used a boundary closing algorithm; Fig. 8 shows the result of applying the boundary closing algorithm.
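The boundary removal and min-area operators can be sketched together as a single pass over connected components. This is an illustrative sketch under assumptions: the name `clean_mask`, the `min_area` parameter and the use of 4-connectivity are not specified in the text.

```python
from collections import deque
from typing import List

Mask = List[List[int]]  # 1 = foreground, 0 = background


def clean_mask(mask: Mask, min_area: int) -> Mask:
    """Keep 4-connected components that do not touch the image
    border and contain at least `min_area` pixels."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    out = [[0] * cols for _ in range(rows)]
    for r0 in range(rows):
        for c0 in range(cols):
            if not mask[r0][c0] or seen[r0][c0]:
                continue
            # Flood-fill one component, noting any border contact.
            component, touches_border = [], False
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            while queue:
                r, c = queue.popleft()
                component.append((r, c))
                if r in (0, rows - 1) or c in (0, cols - 1):
                    touches_border = True
                for nr, nc in ((r - 1, c), (r + 1, c),
                               (r, c - 1), (r, c + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and mask[nr][nc] and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            if not touches_border and len(component) >= min_area:
                for r, c in component:
                    out[r][c] = 1
    return out
```

Treating both filters as component-level tests keeps the fish body intact while discarding background fragments along the frame and isolated specks.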

Fig. 8. The result after applying the boundary closing algorithm.

Segmentation Using Mean-Shift. Mean shift is a powerful clustering technique that is very useful for damping shading or tonality differences in localized objects; it is used in many fields such as image segmentation, clustering, visual tracking and space analysis. Because fish show more contrast in the Value component, images were converted to the HSV color space. The mean shift algorithm is then applied, followed by edge detection to extract the fish from the background.
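To illustrate how mean shift damps tonality differences, the sketch below runs a one-dimensional mean shift with a flat kernel on intensity values, so that each value moves to the mode of its neighborhood. It is a simplified stand-in for image-space mean shift; `mean_shift_modes`, `bandwidth`, `max_iter` and `tol` are hypothetical names and parameters.

```python
from typing import List


def mean_shift_modes(values: List[float], bandwidth: float,
                     max_iter: int = 100, tol: float = 1e-3) -> List[float]:
    """Shift each value to the local mean of all values within
    `bandwidth` until it stops moving; return the converged modes."""
    modes = []
    for v in values:
        x = float(v)
        for _ in range(max_iter):
            # Flat kernel: every value inside the window counts equally.
            window = [u for u in values if abs(u - x) <= bandwidth]
            new_x = sum(window) / len(window)
            if abs(new_x - x) < tol:
                break
            x = new_x
        modes.append(x)
    return modes
```

Replacing every intensity by its mode flattens each region to a single tone, which is why a subsequent edge detector sees cleaner boundaries.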

Segmentation Using Region Growing. In this section, the region growing method is used. In general, region-based methods compare a pixel with its neighbors; if a similarity criterion is satisfied, the pixel is assigned to the same cluster as one or more of its neighbors. To improve the results, the center coordinates of the fish were entered manually by the user, and the region growing algorithm then returns the segmented fish body.
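Seeded region growing can be sketched as a breadth-first search from the user-supplied point. The similarity criterion used here (absolute intensity difference from the seed pixel within a `tolerance`) is an assumption, since the text does not state which criterion was used; `region_grow` is a hypothetical name.

```python
from collections import deque
from typing import List, Tuple


def region_grow(image: List[List[int]], seed: Tuple[int, int],
                tolerance: int) -> List[List[int]]:
    """Grow a 4-connected region from `seed`, adding neighbors whose
    intensity is within `tolerance` of the seed intensity."""
    rows, cols = len(image), len(image[0])
    sr, sc = seed
    seed_value = image[sr][sc]
    mask = [[0] * cols for _ in range(rows)]
    mask[sr][sc] = 1
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and not mask[nr][nc]
                    and abs(image[nr][nc] - seed_value) <= tolerance):
                mask[nr][nc] = 1
                queue.append((nr, nc))
    return mask
```

For example, `region_grow(image, (row, col), 10)` returns a binary mask of the region connected to the chosen center point.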

4 Results and Discussions

This section presents the results for the five segmentation methods used in this paper: the Grabcut algorithm, the Otsu thresholding method, the edge detection technique, the mean-shift method and the region-growing algorithm. Figures 9, 10, 11 and 12 show the results from these methods.

Fig. 9. First row: images before segmentation; second row: images after segmentation using Grabcut.

Fig. 10. First row: images before segmentation; second row: images after segmentation using Otsu thresholding.

Fig. 11. First row: images before segmentation; second row: images after segmentation using edge detection.

Fig. 12. First row: images before segmentation; second row: images after segmentation using mean-shift.

To evaluate the segmentation methods, three evaluation criteria are used: RMSE, PSNR and SSIM [16]. RMSE is the root mean square deviation, which measures the difference between values; SSIM is the structural similarity index measure, which measures the similarity between two images; and PSNR is the peak signal-to-noise ratio, which is commonly used to measure the reconstruction quality of lossy compression codecs. RMSE is calculated from Eq. 4, PSNR from Eq. 5 and SSIM from Eq. 6. Table 1 shows the comparison results for all segmentation methods. Overall, Table 1 shows that the Grabcut algorithm outperforms the other segmentation methods in terms of RMSE and PSNR.

$$\begin{aligned} RMSE = \sqrt{\frac{1}{MQ} \sum _{i=1}^{M} \sum _{j=1}^{Q} (Org(i,j) - Seg(i,j))^2} \end{aligned}$$
(4)

Where Org is the original image, Seg is the segmented image, and M and Q are the image dimensions in pixels.

$$\begin{aligned} PSNR = 20 * \log _{10} \frac{255}{RMSE} \end{aligned}$$
(5)
$$\begin{aligned} SSIM(Org, Seg) = \frac{(2\mu _{Org}\mu _{Seg} + C_{1})(2\sigma _{Org,Seg} + C_{2})}{(\mu ^2_{Org} + \mu ^2_{Seg} + C_{1})(\sigma ^2_{Org} + \sigma ^2_{Seg} + C_{2})} \end{aligned}$$
(6)

Where \(\mu _{Org}\) and \(\mu _{Seg}\) are the mean intensities of the original and the segmented images, \(\sigma ^2_{Org}\) and \(\sigma ^2_{Seg}\) are the variances of the two images, \(\sigma _{Org,Seg}\) is the covariance of the two images, and \(C_{1}=6.5025\) and \(C_{2}=58.5225\) are constants.
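The three criteria can be sketched in a few lines for 8-bit grayscale images stored as nested lists. Two points are assumptions of this sketch: RMSE divides by the pixel count, as the name "root mean square" implies, and SSIM is computed once over the whole image, whereas practical implementations usually average it over local windows. The function names are hypothetical.

```python
import math
from typing import List

Gray = List[List[float]]


def rmse(org: Gray, seg: Gray) -> float:
    """Root mean square deviation between two same-sized images."""
    n = len(org) * len(org[0])
    sq = sum((o - s) ** 2
             for row_o, row_s in zip(org, seg)
             for o, s in zip(row_o, row_s))
    return math.sqrt(sq / n)


def psnr(org: Gray, seg: Gray) -> float:
    """PSNR in dB for 8-bit images; infinite for identical images."""
    e = rmse(org, seg)
    return math.inf if e == 0 else 20 * math.log10(255 / e)


def ssim_global(org: Gray, seg: Gray,
                c1: float = 6.5025, c2: float = 58.5225) -> float:
    """SSIM evaluated once over the whole image (single window)."""
    a = [v for row in org for v in row]
    b = [v for row in seg for v in row]
    n = len(a)
    mu_a, mu_b = sum(a) / n, sum(b) / n
    var_a = sum((x - mu_a) ** 2 for x in a) / n
    var_b = sum((y - mu_b) ** 2 for y in b) / n
    cov = sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / n
    return (((2 * mu_a * mu_b + c1) * (2 * cov + c2))
            / ((mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)))
```

Lower RMSE and higher PSNR indicate a closer match, while SSIM approaches 1 as the two images become structurally identical.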

Table 1. Comparison between all segmentation methods in terms of RMSE, PSNR and SSIM.

5 Conclusion and Future Work

In this paper, five segmentation methods are presented to detect and segment fish from natural images under different conditions. To evaluate these methods, segmentation was performed on a set of images. The tests were carried out on synthetic and real images (the fish dataset). These images were chosen to test the ability of all five methods to segment fish that are difficult to discern, in the presence of noise and with any number of classes. The experimental results showed that the segmentation quality obtained by the Grabcut algorithm is satisfactory and better than that of the other methods; it achieved the best segmentation results. It may also be noted that the computation time of the Grabcut algorithm is independent of the size of the image and the number of iterations. In future work, automatic fish classification by color and texture remains to be studied. It is also worth investigating fish species classification by color and texture based on machine learning and meta-heuristic optimization algorithms.