1 Introduction

Images captured in water using sound as the source are referred to as underwater acoustic images. Because sound travels much farther in water than light, with far less attenuation, it is considered the best source for imaging in this medium. Many acoustic instruments are available to travel through the ocean, sweep the seafloor and detect what is happening there. Among these instruments, side scan sonar is the best suited for surveying the sea bed and capturing the scene of the seafloor. Sonar images are gaining importance in archaeology. The applications of side scan sonar include fish habitat mapping, location of sunken objects, underwater obstruction review, and cable and pipeline surveys. The side scan sonar consists of three major components: the tow fish, the top processing unit and the connecting cable. When immersed in the sea, the sonar emits sound pulses that propagate through the water, and based on the reverberation [1] the images are formed and can be viewed in the top processing unit. Underwater acoustic images are often dominated by noise, which obscures the signals of interest. They typically contain two types of region, namely objects and their shadows [2]. When the spatial character of the noise differs from that of the signal, image processing techniques can be employed to enhance the image. The images obtained from sonar may suffer from speckle noise, a characteristic effect that contributes to visual distortion. Speckle makes image interpretation difficult: it degrades image quality by reducing the ability of a human observer to discriminate fine details such as edges and other features. It also reduces image contrast and makes tasks such as edge detection and segmentation harder. The objective of any noise removal filter is to remove the noise while retaining the important features of the image.

2 Related work

The images obtained by side scan sonar are usually affected by noise known as speckle noise. It originates from the interference of sound waves reflected from objects back to the sonar. This noise needs to be removed before any further image processing such as object detection or segmentation. Acoustic images have been enhanced using multi-resolution techniques [3, 4] such as image pyramids and the wavelet transform. Various methods have also been adopted to remove speckle noise. The guided filter denoises the image using a guidance image, which also helps to preserve the edges [5]. In another approach, the noisy image is decomposed into several sublevels using the symlet wavelet transform: applying a low pass filter yields a smooth image, and further decomposition with a high pass filter divides the image into vertical, horizontal and diagonal levels. This approach is shown to work better than the median filter [6]. A technique for constructing a morphological inclusion tree using Tarjan's union-find algorithm has been proposed, which is useful in speckle spotting; the speckle noise is removed by pruning the tree of shapes and reconstructing the image with the pixels of ancestor subtrees [7]. A recursive filter, the Kalman filter, has been used to remove speckle noise in sonar images; it predicts the current pixel with the help of the estimated pixel and the Kalman gain [8]. A hybrid algorithm for denoising ultrasound medical images [9] has also been proposed. An edge preserving interpolation method for medical images, which works by denoising the image, has been proposed in [10]. Related work uses a combination of a fuzzy weighted filter and the Kalman filter to reduce the noise [11]. A hybrid algorithm that combines the guided filter, the speckle reducing bilateral filter and the non-local means filter is used to remove the noise and preserve the edges [11].
A comparative study on the selection of mother wavelets for denoising ultrasound images is reported in [12]. Another work uses a homogeneity index to identify the speckle affected areas. It decomposes the noisy image into local patches, which are used to classify the homogeneous and heterogeneous areas of the image. Many methods have been used for speckle noise removal, including sparse representation and non-destructive evaluation [13, 14]. Once a speckle affected patch is identified, it is decomposed using principal components, the clean coefficients are estimated and the image is reconstructed [15]. The noise level of the image is estimated using a patch based algorithm and the denoising is done using non-blind denoising [16]. A variational approach that provides a primal and dual solution to the variational minimization problem is presented in [2]. For speckle noise removal, non-local means based speckle filtering was used in [17], which preserved meaningful details such as edges and fine features. The generalised Cauchy filter is derived using the theory of positive definite densities [18]. This filter is used for image denoising, with its optimal parameters defined using particle swarm optimisation. An optimal threshold is obtained for each subband of the image after applying the non-subsampled contourlet transform [19]. A successive quadratic programming (SQP) optimization is also used. Images were denoised using non-local means modelling [20]. A penalty based method using spatial tonal filters and smoothers [21] is used for image denoising. From a category specific image database [22], the clean images are first selected, and the noisy patches are then identified using spatial locality in order to denoise the images. Using this literature survey, we have identified filters that suit acoustic images for denoising.

3 Speckle noise

In acoustic images, the noise content is both multiplicative and additive. Additive noise is systematic in nature and can be easily identified and removed. Multiplicative noise, on the other hand, is generally more difficult to remove, because the intensity of the noise varies with the image intensity; it is image dependent and complex to model. Speckle is not strictly noise in an image but a noise-like variation in contrast. It arises when a sound wave pulse interferes randomly with small particles or objects on a scale comparable to the sound wavelength.

The speckle noise has the general form as represented in Eq. (1)

$$y(i,j) = x(i,j)n(i,j) + a(i,j)$$
(1)

where \(y(i,j)\) is the noisy image, \(x(i,j)\) is the original image, and \(n(i,j)\) and \(a(i,j)\) represent the multiplicative and additive noise, respectively. The additive noise is negligible in these images, so the methods used should concentrate only on the multiplicative noise.
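The model in Eq. (1) can be illustrated with a small simulation. The sketch below is only an illustration: the function name `add_speckle`, the Gaussian form of both noise terms and the parameter values are assumptions made here, not part of the original work.

```python
import random

def add_speckle(image, sigma_n=0.2, sigma_a=0.0, seed=0):
    """Return y = x * n + a (Eq. 1) for a 2-D list of intensities.

    Assumptions for illustration: n is Gaussian with mean 1 and a is
    Gaussian with mean 0. Real sonar speckle need not be Gaussian.
    """
    rng = random.Random(seed)
    noisy = []
    for row in image:
        out_row = []
        for x in row:
            n = rng.gauss(1.0, sigma_n)  # multiplicative term n(i, j)
            a = rng.gauss(0.0, sigma_a)  # additive term a(i, j)
            out_row.append(x * n + a)
        noisy.append(out_row)
    return noisy

# A dark row and a bright row receive the same relative disturbance, so
# the absolute deviation scales with the underlying intensity.
flat = [[50.0] * 4, [200.0] * 4]
noisy = add_speckle(flat)
```

With `sigma_a` left at zero only the multiplicative term acts, which matches the statement that the additive part is negligible.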

A single scale spatial filter is an image operation in which each pixel value I(m,n) is replaced by a function of the intensities of the pixels in a neighborhood of (m,n). Spatial filtering is performed by convolution, a neighborhood operation that applies a matrix of weights, called the kernel, to the input pixels. A convolution kernel is a correlation kernel that has been rotated by 180 degrees. Many spatial filters are used for smoothing and for sharpening images. Smoothing filters such as the median, Lee, Wiener, Frost, guided and bilateral filters are used to remove noise.

4 Single scale spatial filters

4.1 Median filters

Median filters are widely used for removing noise while preserving edges. The median m of a set of values is such that half the values in the set are less than or equal to m and half are greater than or equal to m. To obtain the median of the neighbourhood, the values within the filter window are sorted and the median value is assigned to the output image. The principle of the median filter is to force points with distinct intensity levels to be more like their neighbouring pixels.

The median filter is applied uniformly across all pixels of the image. Thus it denoises the image at the expense of distorted features and over-smoothing of the fine details present in the image.

The steps of the median filter are as follows:

  1. A region of size 3 × 3 (or 5 × 5, etc.) centred on the pixel (i, j) is selected as the kernel.

  2. The intensity values of the pixels in the region are sorted in ascending order.

  3. The middle value is selected as the new value of pixel (i, j).
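The three steps above can be sketched in plain Python as follows. The function name `median_filter` and the border handling (replicating edge pixels by clamping indices) are assumptions made for this illustration; the text does not specify how borders are treated.

```python
def median_filter(image, size=3):
    """Median-filter a 2-D list of intensities with a size x size window."""
    rows, cols = len(image), len(image[0])
    half = size // 2
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # Step 1: gather the window centred on (i, j). Indices are
            # clamped so that border pixels are replicated (assumption).
            window = [
                image[min(max(i + di, 0), rows - 1)][min(max(j + dj, 0), cols - 1)]
                for di in range(-half, half + 1)
                for dj in range(-half, half + 1)
            ]
            # Step 2: sort the intensities in ascending order.
            window.sort()
            # Step 3: the middle value becomes the new value of (i, j).
            out[i][j] = window[len(window) // 2]
    return out

# A single bright outlier on a flat background is removed completely.
img = [
    [10, 10, 10, 10, 10],
    [10, 10, 10, 10, 10],
    [10, 10, 255, 10, 10],
    [10, 10, 10, 10, 10],
    [10, 10, 10, 10, 10],
]
filtered = median_filter(img)
```

In practice an array library would be used for speed; the nested loops are kept here only to mirror the listed steps.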

4.2 Lee filters

Lee filters are used to smooth speckled data whose intensity is related to the image scene and that has an additive and/or multiplicative component. The Lee filter is a standard deviation based (sigma) filter that filters data based on statistics calculated within individual filter windows. The pixel being filtered is replaced by a value calculated from the surrounding pixels. It uses the minimum mean square error (MMSE) criterion and assumes that the speckle noise is distributed equally in all regions. Thus it is not well suited to image areas with sudden changes in pixel values, such as edges, which become blurred. The Lee filter is given by

$$\hat{R}\left( t \right) = \bar{I}\left( t \right) + W\left( t \right)\left( {I\left( t \right) - \bar{I}\left( t \right)} \right)$$
(2)

where \(W\left( t \right) = 1 - \frac{{C_{v} }}{{C_{I} }}\), \(\hat{R}\left( t \right)\) represents the filtered image, \(\bar{I}\left( t \right)\) represents the mean value of I(t), \(C_{v}\) represents the coefficient of variation of the noise affected image, and \(C_{I}\) represents the coefficient of variation of the noise free image.
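A minimal sketch of Eq. (2) is given below. It rests on several assumptions that the text does not state: the weight is computed in the commonly cited form W = 1 − C_noise² / C_local², where C_local² is the local variance divided by the squared local mean; the noise coefficient of variation is passed in as the hypothetical parameter `noise_cv`; the weight is clipped at zero; and borders are handled by replicating edge pixels.

```python
def lee_filter(image, size=3, noise_cv=0.25):
    """Lee-type filter: R = mean + W * (I - mean), computed per window."""
    rows, cols = len(image), len(image[0])
    half = size // 2
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            window = [
                image[min(max(i + di, 0), rows - 1)][min(max(j + dj, 0), cols - 1)]
                for di in range(-half, half + 1)
                for dj in range(-half, half + 1)
            ]
            mean = sum(window) / len(window)
            var = sum((v - mean) ** 2 for v in window) / len(window)
            if mean == 0 or var == 0:
                weight = 0.0  # flat window: output falls back to the local mean
            else:
                local_cv2 = var / (mean * mean)  # squared local coefficient of variation
                weight = max(0.0, 1.0 - (noise_cv ** 2) / local_cv2)
            out[i][j] = mean + weight * (image[i][j] - mean)
    return out

# Near a strong edge the local variation dominates, W approaches 1 and
# the pixel stays close to its original value.
edge = [[10.0, 10.0, 200.0, 200.0] for _ in range(4)]
smoothed = lee_filter(edge)
```

In homogeneous regions the weight drops towards zero and the output approaches the local mean, which is where the smoothing takes place.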

4.3 Wiener filters

The Wiener filter is based on a statistical approach to filtering out the noise present in the image. It performs an optimal trade-off between noise smoothing and inverse filtering, removing both the blurring and the additive noise present in the image. Since it works in the frequency domain, it is comparatively slow. The Wiener filter is given by

$$\hat{F}\left( {u,v} \right) = \left[ {\frac{{H^{*} (u,v)}}{{\left| {H(u,v)} \right|^{2} + \frac{{S_{n} (u,v)}}{{S_{f} (u,v)}}}}} \right]G(u,v)$$
(3)

where \(\hat{F}(u,v)\) represents the estimate of the original image, H(u,v) represents the degradation function and \(H^{*}(u,v)\) its complex conjugate, G(u,v) represents the degraded image, \(S_{n}(u,v)\) represents the power spectrum of the noise, and \(S_{f}(u,v)\) represents the power spectrum of the original image.

4.4 Frost filters

Frost filters are used to reduce speckle while preserving edges in radar images. The Frost filter is an exponentially damped, circularly symmetric filter that uses local statistics. The pixel being filtered is replaced with a value calculated from the distance to the filter centre, the damping factor and the local variance. It is also used to remove multiplicative noise from images. The filter is based on the coefficient of variation, which is the ratio of the local standard deviation to the local mean of the degraded image.

The Frost filter is given by

$$FF = \mathop \sum \limits_{n \times n} K\alpha e^{ - \alpha \left| t \right|}$$
(4)

where, \(\alpha = \left( {\frac{4}{{n\bar{\sigma }^{2} }}} \right)\left( {\frac{{\sigma^{2} }}{{\bar{I}^{2} }}} \right)\)

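K represents a normalising constant, \(\bar{I}\) represents the local mean, \(\sigma^{2}\) represents the local variance, \(\bar{\sigma }\) represents the image coefficient of variation, n represents the moving kernel size, and \(\left| t \right|\) is the distance of the pixel (X, Y) from the kernel centre \((X_{0}, Y_{0})\):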

$$\left| t \right| = \left| {X - X_{0} } \right| + \left| {Y - Y_{0} } \right|$$
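The following sketch applies the weights of Eq. (4) within a moving window. Several points are assumptions made for illustration: the output pixel is taken as the weighted average of the window intensities, with K chosen so that the weights sum to one; the image coefficient of variation is supplied through the hypothetical parameter `image_cv`; and borders are handled by replicating edge pixels.

```python
import math

def frost_filter(image, size=3, image_cv=1.0):
    """Frost-type filter using exponentially damped weights exp(-alpha * |t|)."""
    rows, cols = len(image), len(image[0])
    half = size // 2
    offsets = [(di, dj) for di in range(-half, half + 1)
               for dj in range(-half, half + 1)]
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            values = [
                image[min(max(i + di, 0), rows - 1)][min(max(j + dj, 0), cols - 1)]
                for di, dj in offsets
            ]
            mean = sum(values) / len(values)
            var = sum((v - mean) ** 2 for v in values) / len(values)
            # alpha = (4 / (n * sigma_bar^2)) * (sigma^2 / mean^2)
            if mean == 0:
                alpha = 0.0
            else:
                alpha = (4.0 / (size * image_cv ** 2)) * (var / (mean * mean))
            # |t| = |X - X0| + |Y - Y0| is the distance to the window centre.
            weights = [math.exp(-alpha * (abs(di) + abs(dj))) for di, dj in offsets]
            # Dividing by the weight sum plays the role of the constant K.
            out[i][j] = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    return out

edge = [[10.0, 10.0, 200.0, 200.0] for _ in range(4)]
result = frost_filter(edge)
```

When the local variance is small, alpha is close to zero and the window is averaged almost uniformly; when it is large, the weights decay quickly and the centre pixel dominates.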

4.5 Guided filter

The guided filter uses a reference image, called the guidance image, to filter the noise from the image. The guidance image can be either the input image itself or another image. When another image is used as guidance, the structures of the guidance image are transferred to the filtered output. Using the input itself as the guidance image is called self-guided filtering, which is more edge-preserving while removing the speckle noise. The guided filter has many applications, such as detail smoothing and enhancement, HDR compression, image matting and feathering, haze removal, and joint upsampling. The time complexity for both gray-scale and colour images is O(N), where N is the number of pixels.

The guided filter formula is given by

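$$q_{i} = a_{k} I_{i} + b_{k} \quad \forall \; i \in \omega_{k}$$
(5)
$$a_{k} = \frac{{\frac{1}{\left| \omega \right|}\mathop \sum \nolimits_{{i \in \omega_{k} }} I_{i} p_{i} - \mu_{k} \bar{p}_{k} }}{{\sigma_{k}^{2} + \epsilon }}$$
(6)
$$b_{k} = \bar{p}_{k} - a_{k} \mu_{k}$$
(7)
$$\bar{p}_{k} = \frac{1}{\left| \omega \right|}\mathop \sum \nolimits_{{i \in \omega_{k} }} p_{i}$$
(8)

where \(q\) represents the filtered output, \(I\) the guidance image and \(p\) the input image; \(\mu_{k}\) and \(\sigma_{k}^{2}\) represent the mean and variance of \(I\) in the window \(\omega_{k}\) centred at pixel k; \(\left| \omega \right|\) is the number of pixels in the window; \(\bar{p}_{k}\) is the mean of \(p\) in \(\omega_{k}\); and \(\epsilon\) is a regularisation parameter.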
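Equations (5) to (8) can be sketched as below. Two points are assumptions beyond the equations given here: window means near the border are taken over the truncated window, and, because each pixel lies in several overlapping windows, the coefficients \(a_k\) and \(b_k\) are averaged over those windows before Eq. (5) is applied, as in the commonly described formulation of the guided filter. The names `box_mean` and `guided_filter` are illustrative.

```python
def box_mean(img, radius):
    """Mean over a (2*radius+1)^2 window, truncated at the image border."""
    rows, cols = len(img), len(img[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total, count = 0.0, 0
            for y in range(max(0, i - radius), min(rows, i + radius + 1)):
                for x in range(max(0, j - radius), min(cols, j + radius + 1)):
                    total += img[y][x]
                    count += 1
            out[i][j] = total / count
    return out

def guided_filter(guide, src, radius=1, eps=1e-2):
    """Guided filter of src (p) using guide (I); pass src twice for self-guided."""
    rows, cols = len(src), len(src[0])
    prod_ip = [[guide[i][j] * src[i][j] for j in range(cols)] for i in range(rows)]
    prod_ii = [[guide[i][j] ** 2 for j in range(cols)] for i in range(rows)]
    mu = box_mean(guide, radius)        # mu_k
    p_bar = box_mean(src, radius)       # p_bar_k, Eq. (8)
    corr_ip = box_mean(prod_ip, radius)
    corr_ii = box_mean(prod_ii, radius)
    a = [[0.0] * cols for _ in range(rows)]
    b = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            var = corr_ii[i][j] - mu[i][j] ** 2          # sigma_k^2
            cov = corr_ip[i][j] - mu[i][j] * p_bar[i][j]
            a[i][j] = cov / (var + eps)                  # Eq. (6)
            b[i][j] = p_bar[i][j] - a[i][j] * mu[i][j]   # Eq. (7)
    # Average the coefficients of all windows covering each pixel (assumption).
    mean_a, mean_b = box_mean(a, radius), box_mean(b, radius)
    return [[mean_a[i][j] * guide[i][j] + mean_b[i][j] for j in range(cols)]
            for i in range(rows)]                        # Eq. (5)

# Self-guided filtering of a step edge: the edge stays sharp.
edge = [[0.0] * 3 + [100.0] * 3 for _ in range(6)]
q = guided_filter(edge, edge)
```

A larger `eps` pushes \(a_k\) towards zero, so the output tends to the local mean and the smoothing becomes stronger.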

4.6 Bilateral filter

The bilateral filter uses both the distance between pixels and the intensity variations of the image. Thus, unlike other filters, it is a combination of a domain filter and a range filter; mathematically, its weight is the product of the two. If either of the weights is close to zero, no smoothing occurs. The filter may be viewed as splitting the image into two parts, namely the filtered image and the residual image. The residual image contains the details or noise removed by the filter. Although it is more edge preserving, it is a computationally expensive algorithm. It depends on two parameters, σs and σr. As the range parameter σr increases, the bilateral filter gradually approximates Gaussian convolution. Increasing the spatial parameter σs smooths larger features. The bilateral filter is given by

$$BF\left[ I \right]_{p} = \frac{1}{{W_{p} }}\mathop \sum \limits_{q \epsilon S} G_{{\sigma_{s} }} \left( {p - q} \right)G_{{\sigma_{r} }} \left( {I_{p} - I_{q} } \right)I_{q}$$
(9)
$$W_{p} = \mathop \sum \limits_{q \epsilon S} G_{{\sigma_{s} }} \left( {p - q} \right)G_{{\sigma_{r} }} \left( {I_{p} - I_{q} } \right)$$
(10)

where \(I_{p}\) represents the image value at pixel position p, S represents the set of all possible image locations, \(G_{{\sigma_{s} }} \left( {p - q} \right)\) represents the spatial Gaussian evaluated on the Euclidean distance between the pixels p and q, \(G_{{\sigma_{r} }}\) represents the range Gaussian evaluated on the intensity difference, and \(W_{p}\) represents the normalization factor.
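A direct, unoptimised sketch of Eqs. (9) and (10) follows. As a practical simplification assumed here, the sum over S is restricted to a square window of the given `radius` around p, since the spatial Gaussian makes distant pixels contribute very little. The function name and default parameter values are illustrative.

```python
import math

def bilateral_filter(image, radius=2, sigma_s=2.0, sigma_r=25.0):
    """Brute-force bilateral filter on a 2-D list of intensities."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            num, w_p = 0.0, 0.0
            for y in range(max(0, i - radius), min(rows, i + radius + 1)):
                for x in range(max(0, j - radius), min(cols, j + radius + 1)):
                    dist2 = (y - i) ** 2 + (x - j) ** 2          # squared spatial distance
                    diff2 = (image[y][x] - image[i][j]) ** 2     # squared intensity difference
                    w = (math.exp(-dist2 / (2.0 * sigma_s ** 2))      # spatial Gaussian
                         * math.exp(-diff2 / (2.0 * sigma_r ** 2)))   # range Gaussian
                    num += w * image[y][x]
                    w_p += w                                      # Eq. (10)
            out[i][j] = num / w_p                                 # Eq. (9)
    return out

# Across a step edge the range weight is nearly zero, so the two sides
# are smoothed separately and the edge is kept.
edge = [[0.0] * 3 + [100.0] * 3 for _ in range(6)]
smoothed = bilateral_filter(edge, sigma_r=10.0)
```

The nested loops make the cost proportional to the number of pixels times the window area, which reflects the expense noted above.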

Many researchers have shown that spatial filters remove noise from distorted generic images. Many types of noise affect images, including Gaussian noise, salt and pepper noise, periodic noise, uniform noise and anisotropic noise. However, the characteristics of acoustic images differ from those of generic images. As acoustic images are captured using sound as the source, the only significant noise in them is speckle noise. Many filters are available in both the spatial and frequency domains for noise removal. In this paper, we have identified the filters that tend to reduce speckle noise. Speckle noise is multiplicative in nature, and removing it with a spatial filter can affect the quality of the image. To reduce the speckle noise, we have used the median, Lee, Wiener, Frost, guided and bilateral filters. We also show that bilateral filtering followed by guided filtering denoises the image while retaining its quality by preserving the features. Performance measures such as MSE, PSNR, SSIM and entropy indicate that this filtering method gives good results.

5 Experimental results

Underwater acoustic imaging systems are generally used either for classifying objects or for observing the details of objects, usually from some form of underwater vehicle. In this work, side scan sonar is used to obtain two dimensional images of the seashore and of objects under water. Acoustic images are generally affected by speckle noise, a granular noise that reduces the resolution of the image and the detectability of objects. Speckle is due to the random interference of wavelets scattered by the microscopic fluctuations of an optically rough object surface.

The speckle noise in the acoustic image was removed using various filters: the median, Lee, Wiener, Frost, guided and bilateral filters. The main disadvantage of these traditional speckle reduction filters is that they eradicate weak and diffuse edges, which makes the acoustic images harder to interpret, especially when investigating finer details. In this work, the bilateral filter followed by the guided filter, when applied to acoustic images, is found to remove speckle noise without affecting the edges.
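The ordering described above, bilateral filtering first and self-guided filtering second, can be illustrated on a single scanline. This is a deliberately simplified one-dimensional sketch and not the implementation used in this work: the function names, window radius and parameter values are all assumptions, and the guided stage averages its coefficients over overlapping windows as in the commonly described formulation.

```python
import math

def bilateral_1d(signal, radius=2, sigma_s=2.0, sigma_r=20.0):
    """1-D bilateral filter: spatial and range Gaussian weights combined."""
    out = []
    for p, centre in enumerate(signal):
        num = den = 0.0
        for q in range(max(0, p - radius), min(len(signal), p + radius + 1)):
            w = math.exp(-((p - q) ** 2) / (2 * sigma_s ** 2)
                         - ((centre - signal[q]) ** 2) / (2 * sigma_r ** 2))
            num += w * signal[q]
            den += w
        out.append(num / den)
    return out

def box_1d(signal, radius):
    """Moving average with windows truncated at the ends."""
    out = []
    for k in range(len(signal)):
        window = signal[max(0, k - radius): k + radius + 1]
        out.append(sum(window) / len(window))
    return out

def self_guided_1d(signal, radius=2, eps=1.0):
    """1-D guided filter with the signal as its own guidance."""
    mu = box_1d(signal, radius)
    corr = box_1d([v * v for v in signal], radius)
    # Self-guided case: covariance equals variance, so a = var / (var + eps).
    a = [(c - m * m) / (c - m * m + eps) for c, m in zip(corr, mu)]
    b = [m - ak * m for ak, m in zip(a, mu)]
    mean_a, mean_b = box_1d(a, radius), box_1d(b, radius)
    return [ma * v + mb for ma, v, mb in zip(mean_a, signal, mean_b)]

def bilateral_then_guided(signal):
    """Apply the bilateral filter first, then the self-guided filter."""
    return self_guided_1d(bilateral_1d(signal))

# A noisy step: small fluctuations are reduced while the jump is kept.
scanline = [10, 12, 9, 11, 10, 100, 102, 99, 101, 100]
denoised = bilateral_then_guided(scanline)
```

Swapping the two calls inside `bilateral_then_guided` gives the reverse ordering, which is the alternative combination the proposed method is compared against.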

6 Performance measures

6.1 Mean squared error (MSE)

MSE is one of the measures used to evaluate image quality. It is the cumulative squared error between the original image and the processed image, averaged over all pixels. It depends on the intensity scaling of the image and is used to calculate another important measure, PSNR. The value gives the amount by which the processed image differs from the original image. MSE is given by

$${\text{MSE}} = \frac{1}{mn}\mathop \sum \limits_{i = 0}^{m - 1} \mathop \sum \limits_{j = 0}^{n - 1} \left[ {f\left( {i,j} \right) - g\left( {i,j} \right)} \right]^{2}$$
(11)

where f represents the original image, g represents the processed image, and m and n represent the dimensions of the image.

6.2 Peak signal to noise ratio (PSNR)

PSNR estimates the quality of an image after processing such as compression or denoising. It is measured in decibels (dB). Whereas MSE measures the cumulative squared error, PSNR measures the peak error. The higher the PSNR, the better the quality of the reconstructed image. PSNR is given by

$${\text{PSNR}} = 20 {\text{log}}_{10} \left( {\frac{{{\text{MAX}}_{\text{f}} }}{{\sqrt {\text{MSE}} }}} \right)$$
(12)

where \({\text{MAX}}_{\text{f}}\) represents the maximum signal value and MSE represents the mean squared error.
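Equations (11) and (12) translate directly into code. In the sketch below the function names are illustrative, and the default peak value of 255 assumes 8-bit images, which the text does not state.

```python
import math

def mse(f, g):
    """Mean squared error between original f and processed g (Eq. 11)."""
    m, n = len(f), len(f[0])
    return sum((f[i][j] - g[i][j]) ** 2
               for i in range(m) for j in range(n)) / (m * n)

def psnr(f, g, max_value=255.0):
    """Peak signal to noise ratio in dB (Eq. 12)."""
    err = mse(f, g)
    if err == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 20.0 * math.log10(max_value / math.sqrt(err))

original = [[100, 100], [100, 100]]
processed = [[110, 90], [100, 100]]
print(mse(original, processed))  # 50.0
```

For this small example the PSNR is a little above 31 dB; halving the MSE raises it by about 3 dB.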

6.3 Structural SIMilarity Index (SSIM)

SSIM is used to measure the similarity between two images, one of which is the original (reference) image and the other the processed image. It cannot predict which image is better, but it measures the structural similarity between the two. SSIM is given by

$${\text{SSIM}}\left( {\text{x,y}} \right) = \frac{{\left( {2\upmu_{\text{x}}\upmu_{\text{y}} + {\text{c}}_{1} } \right)\left( {2\upsigma_{\text{xy}} + {\text{c}}_{2} } \right)}}{{\left( {\upmu_{\text{x}}^{2} +\upmu_{\text{y}}^{2} + {\text{c}}_{1} } \right)\left( {\upsigma_{\text{x}}^{2} +\upsigma_{\text{y}}^{2} + {\text{c}}_{2} } \right)}}$$
(13)

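where \(\mu_{x}\) is the average of x, \(\mu_{y}\) is the average of y, \(\sigma_{x}^{2}\) is the variance of x, \(\sigma_{y}^{2}\) is the variance of y, \(\sigma_{xy}\) is the covariance of x and y, and \(c_{1}\) and \(c_{2}\) are constants that stabilize the division when the denominator is weak.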
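Equation (13) can be evaluated over a whole image pair as sketched below. This single-window version is a simplification assumed for illustration; SSIM is usually computed over local windows and then averaged. The constants follow the commonly used choice \(c_1 = (0.01L)^2\) and \(c_2 = (0.03L)^2\) with L the dynamic range, which is likewise an assumption not stated in the text.

```python
def ssim_global(x, y, max_value=255.0, k1=0.01, k2=0.03):
    """Single-window SSIM (Eq. 13) computed over the whole image."""
    xs = [v for row in x for v in row]
    ys = [v for row in y for v in row]
    n = len(xs)
    mu_x = sum(xs) / n
    mu_y = sum(ys) / n
    var_x = sum((v - mu_x) ** 2 for v in xs) / n
    var_y = sum((v - mu_y) ** 2 for v in ys) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(xs, ys)) / n
    c1 = (k1 * max_value) ** 2  # stabilises the luminance term
    c2 = (k2 * max_value) ** 2  # stabilises the contrast/structure term
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator

checker = [[0, 255], [255, 0]]
inverted = [[255, 0], [0, 255]]
same_score = ssim_global(checker, checker)
opposite_score = ssim_global(checker, inverted)
```

An image compared with itself scores 1, while the inverted checkerboard has negative covariance with the original and therefore a negative score.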

6.4 Entropy

The entropy of an image is the quantity used to describe the amount of information contained in the image. An image with low entropy is flat, has too many black pixels or has very low contrast. Entropy is given by

$${\text{H}} = - \mathop \sum \limits_{\text{k}} \,{\text{p}}_{\text{k}} { \log }_{2} \left( {{\text{p}}_{\text{k}} } \right)$$
(14)

where the sum runs over the K gray levels of the image and \(p_{k}\) represents the probability associated with gray level k. Figure 1 shows the results of the various filtering methods used for removing the speckle noise from the acoustic images.
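The entropy in Eq. (14) can be computed from the gray-level histogram, as in the following sketch. The function name is illustrative, and the probabilities are estimated as relative frequencies of the gray levels that actually occur, so levels with zero probability are skipped.

```python
import math
from collections import Counter

def entropy(image):
    """Shannon entropy in bits of the gray-level distribution (Eq. 14)."""
    pixels = [v for row in image for v in row]
    counts = Counter(pixels)  # histogram of the gray levels present
    total = len(pixels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Two equally likely gray levels carry exactly one bit per pixel.
print(entropy([[0, 255], [255, 0]]))  # 1.0
```

A completely flat image has a single gray level with probability one and therefore zero entropy.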

Fig. 1 Results of various filtering methods for acoustic images

7 Experimental results

Filters such as the Lee, median, Wiener, Frost, guided and bilateral filters were applied to the acoustic images separately. The output of the guided filter was then given as input to the bilateral filter to check the quality. However, it was found that when the bilateral filter was applied first, followed by the guided filter, the images could be denoised with the edges preserved. We compared the proposed method with standard filters, namely the median, Wiener, Lee, Frost, guided and bilateral filters. The analysis uses the quality measures PSNR, MSE, SSIM and entropy. Table 1 shows the results of the comparison.

Table 1 MSE, PSNR, SSIM, entropy values for the various techniques

8 Conclusion

Thus the speckle noise in side scan sonar images can be removed by applying the bilateral filter first, followed by the guided filter. As the bilateral filter produces the staircase effect and the gradient reversal effect, it is followed by the self-guided filter, which is free of these artefacts. We compared our results with other standard filters, namely the Lee, Frost, median and Wiener filters, and with the combination of the guided filter followed by the bilateral filter. It is observed that the bilateral filter followed by the guided filter works better than these filters.