Introduction

Because it is painless, radiation-free and cost-effective, ultrasound (US) imaging is commonly used as a primary tool for the diagnosis of abdominal tumours. However, its qualitative application is limited by intensity inhomogeneities, blurring (caused by the US point spread function) and multiplicative speckle noise, which can obscure fine anatomical details and decrease segmentation accuracy; US image segmentation thus remains an open problem despite many years of research. Pre-processing of US data involves reducing the effect of blurring, intensity inhomogeneities and speckle. Some filters denoise US images directly, such as the median filter, Lee filter (Lee 1981), Kuan filter (Kuan et al. 1987), SRAD filter (Yu and Acton 2002) and wavelet filter (Ruikar and Doye 2011). Others work in the log/homomorphic domain: a logarithmic transformation converts the speckle to additive noise, the result is filtered, and an exponential transformation then yields the restored image (Zong et al. 1998).

Identifying liver abnormalities can be challenging because of low image quality and low contrast between tumour regions and surrounding tissues. As a result, liver abnormalities are still measured manually, which is subjective and varies with the radiologist’s knowledge and experience. To diagnose a tumour as benign or malignant, a proper segmentation that clearly delineates the tumour boundary is required. Earlier lesion detection can improve the rate of survival. Isoechoic lesions, i.e. tumours whose echo pattern is similar to that of the surrounding liver tissue, are difficult to detect, so determining the size of isoechoic masses is complicated and, in most diagnoses, very subjective. Since tumour echogenicities cannot be characterized by pixel intensity values alone, other approaches involving texture, phase and gradients should be applied to extract the desired region of interest. Most lesion regions in ultrasound images have a high entropy, which can be computed from texture descriptors. Compared with the surrounding liver tissue, tumours can appear hyperechoic (brighter), hypoechoic (darker), isoechoic (similar) or anechoic (black), as shown in Fig. 1.

Fig. 1

US liver tumours of varying echogenicities. a Hypoechoic. b Isoechoic. c Slightly hypoechoic. d Anechoic. e Hyperechoic

Image segmentation is a set of procedures for grouping image sample points into labelled regions. Segmentation methods can be categorized as histogram-based, watershed transformation, clustering, graph-based and deformable-model methods, the last including active contour and level set-based methods. Most of these methods require human intervention. The proposed work aims at automatic segmentation of liver tumours using active contours. To generate the initial mask or contour, standard Otsu thresholding is adapted by constraining the threshold selection not only by the original between-class variance but also by the phase, texture, gradient and phase gradient of the US image. With these multiple constraints, tumours of varying echogenicities can be detected.

Since few works have been reported on liver US tumour segmentation using traditional segmentation and its variants (deep learning falls beyond the scope of the proposed work), works focussing on US breast tumours are also reviewed. Shan et al. (2012) used multi-domain features of the image and neural networks for segmentation of breast lesions. Kumar et al. (2013) proposed an automatic segmentation after peak and valley filtering, which uses statistical features to differentiate normal and abnormal regions. Yoshida (1998) implemented a method to analyse low-contrast suspected regions, in which 1D profiles passing through the centre of the tumour were obtained along multiple radial directions. These profiles are then processed using sombrero continuous wavelets, and modulus maxima lines are used for selecting points on the tumour boundary. The points are fitted with an ellipse and segmented using active contours. Poonguzhali and Ravindran (2006) proposed a method to select seed points for region growing from tumour regions based on textural features such as GLCM and GLRM, using a gradient magnitude-based approach. Egger et al. (2017) proposed a semi-automatic tumour segmentation method using Gaussian filtering, histogram equalization, mean shift and graph cuts, in which the pre-processing itself yielded the seeds needed for segmentation. However, these methods fail for challenging tumour echogenicities, shadowed tumours and severe speckle noise.

Recently, deformable models/active contours have played a significant role in medical image segmentation, since these methods adapt to irregular boundaries and to the presence of noise. Liu et al. (2018) used adaptive thresholding for mask initialization in active contour based segmentation; the Chan-Vese segmentation was modified using the ratio of exponentially weighted averages (ROEWA) for efficient segmentation. Zong et al. (2019) proposed an active contour segmentation driven by local and global intensity information, in which a coarse segmentation based on entropy features is refined by the proposed method. Manikandan and Farook (2014) opted for active contours for carotid artery segmentation in US images; the despeckled image combined with anatomical features is used as the initialization for parametric segmentation. Lotfollahi et al. (2018) used neutrosophy along with improved weighted region-scalable active contours for segmentation of breast US images. The efficiency of active contours relies heavily on the initial contour, and the methods mentioned do not give accurate results for isoechoic and hypoechoic tumours. Identifying these tumours is necessary for early detection and improved survival.

The proposed work employs localized region-based active contours rather than the global region-based Chan-Vese active contours (Chan and Vese 2001) that use uniform modelling energy, in view of the small variations in intensity, texture and gradient exhibited by US images. The mask is initialized by an adaptive thresholding derived from the standard Otsu method, with constraints derived from phase, texture, gradient and phase gradient. The phase gradient, which extracts gradient information from the phase of the tumour image, is a novel concept introduced in this work. To the best of our knowledge, no other works in US segmentation have employed phase and phase gradients. The proposed thresholding yields better results than conventional histogram-based thresholding (Otsu 1978), entropy-based thresholding (Kapur et al. 1985) and its modifications (dos Anjos and Shahbazkia 2008; Wu et al. 1998). The localized version of the uniform modelling energy-based active contours by Lankton and Tannenbaum (2008) was adopted for the proposed segmentation. Instead of specifying the mask manually as in most of the literature, the proposed thresholding and morphological operations were used for mask initialization. Posterior acoustic shadowing (PAS) is a US image artefact observed for highly malignant tumours, in which the tumour tissue casts a shadow on the neighbouring regions. PAS has been addressed in the literature using learning strategies such as AdaBoost (Madabhushi et al. 2006) and watershed segmentation methods (Gomez et al. 2010). However, these approaches considered global rather than localized features, which can result in poor detection of highly shadowed tumours. Existing thresholding methods also incorrectly label the shadow region as tumour because of insufficient intensity variance between the tumour and the shadow. In the proposed method, the PAS problem is alleviated because shadows differ from the neighbouring tumour in texture, phase and gradient even when they have the same intensity.

To summarize, active contour segmentation using localized energy definitions has been implemented for delineating US tumours irrespective of their echogenicity. It is well known that the efficiency of active contour segmentation depends on the initial mask, and hence an adaptive thresholding method with image-feature-dependent constraints has been used. Prior to thresholding, the image is pre-processed using neutrosophic denoising and enhancement along with shearlet-based despeckling. To the best of our knowledge, no other works have combined neutrosophic enhancement and shearlet despeckling for US denoising. The pre-processed image is then thresholded as proposed. The subsequent sections describe the materials chosen, the methods adopted in implementing the proposed work, the results obtained, including comparisons with other standard methods, and finally the concluding remarks, which also suggest the future scope of the work.

Methods

Pre-processing: neutrosophic filtering and shearlet based enhancement

Neutrosophy, introduced by Florentin Smarandache, studies neutralities and indeterminacies. It considers a concept A in relation to its opposite, Anti-A, to that which is not A, Non-A, and to that which is neither A nor Anti-A, denoted by Neut-A. Transforming an image into the neutrosophic domain separates it into three layers or images, namely foreground objects, boundary objects and background (Salama and Smarandache 2014). Because of the fuzziness or randomness exhibited by US images, neutrosophic processing can yield significant results. “A neutrosophic set A in X (a space of points or objects with an element denoted as x), given as A = {(x, T(x), I(x), F(x)) | x ∈ X}, is characterized by a truth-membership function T, an indeterminacy-membership function I and a falsity-membership function F (t ∈ T(x), i ∈ I(x) and f ∈ F(x) vary in ]⁻0, 1⁺[)”. The pixel g(i, j) within an image window of size w in the Cartesian image domain is transformed into the neutrosophic domain as g(i, j) = {T(i, j), I(i, j), F(i, j)} (Salama and Smarandache 2014), where

$$ T\left(i,j\right)=\frac{\overline{g}\left(i,j\right)-{\overline{g}}_{\mathrm{min}}}{{\overline{g}}_{\mathrm{max}}-{\overline{g}}_{\mathrm{min}}} $$
(1)
$$ I\left(i,j\right)=\frac{\delta \left(i,j\right)-{\delta}_{\mathrm{min}}}{\delta_{\mathrm{max}}-{\delta}_{\mathrm{min}}} $$
(2)
$$ F\left(i,j\right)=1-T\left(i,j\right) $$
(3)

and

$$ \overline{g}\left(i,j\right)=\frac{1}{w\ast w}\sum \limits_{m=i-w/2}^{i+w/2}\sum \limits_{n=j-w/2}^{j+w/2}g\left(m,n\right) $$
(4)
$$ {\overline{g}}_{\mathrm{max}}=\max \left(\overline{g}\left(i,j\right)\right). $$
(5)
$$ {\overline{g}}_{\mathrm{min}}=\min \left(\overline{g}\left(i,j\right)\right) $$
(6)
$$ \delta \left(i,j\right)= abs\left(g\left(i,j\right)-\overline{g}\left(i,j\right)\right) $$
(7)
$$ {\delta}_{\mathrm{max}}=\max \left(\delta \left(i,j\right)\right) $$
(8)
$$ {\delta}_{\mathrm{min}}=\min \left(\delta \left(i,j\right)\right) $$
(9)

The Cartesian image is thus converted to a 3D matrix in the neutrosophic domain. The conversion is summarized as:

  1. Read the image.

  2. For each pixel, compute its local mean intensity (\( \overline{g}\left(i,j\right) \)) and the maximum (\( {\overline{g}}_{\mathrm{max}} \)) and minimum (\( {\overline{g}}_{\mathrm{min}} \)) of the local means (Eqs. 4–6).

  3. Compute the absolute divergence between each pixel intensity and its local mean intensity (δ(i,j)) and its maximum (δmax) and minimum (δmin) values (Eqs. 7–9).

  4. Construct the truth matrix T, the indeterminate matrix I and the falseness matrix F (Eqs. 1–3).
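The conversion steps above can be illustrated with a minimal sketch in plain Python. It is not the authors' MATLAB implementation: the function names are hypothetical, images are nested lists, and the window is clamped at the image borders (so the mean is taken over the pixels actually inside the image), which is an assumption that Eq. 4 does not specify.

```python
def local_mean(img, w):
    """Mean of the w-by-w window centred on each pixel (Eq. 4).
    Assumption: windows are clamped at the image borders."""
    rows, cols = len(img), len(img[0])
    half = w // 2
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total, count = 0.0, 0
            for m in range(max(0, i - half), min(rows, i + half + 1)):
                for n in range(max(0, j - half), min(cols, j + half + 1)):
                    total += img[m][n]
                    count += 1
            out[i][j] = total / count
    return out


def normalise(mat):
    """Min-max scale a matrix to [0, 1]; a constant matrix maps to zeros."""
    lo = min(min(row) for row in mat)
    hi = max(max(row) for row in mat)
    span = hi - lo
    return [[(v - lo) / span if span else 0.0 for v in row] for row in mat]


def to_neutrosophic(img, w=3):
    """Return the T, I and F matrices of Eqs. 1-3 for a grey-level image."""
    g_bar = local_mean(img, w)                           # Eq. 4
    delta = [[abs(g - m) for g, m in zip(g_row, m_row)]  # Eq. 7
             for g_row, m_row in zip(img, g_bar)]
    T = normalise(g_bar)                                 # Eqs. 1, 5, 6
    I = normalise(delta)                                 # Eqs. 2, 8, 9
    F = [[1.0 - t for t in row] for row in T]            # Eq. 3
    return T, I, F


# Toy 4x4 image with a bright 2x2 block in the centre.
image = [[10, 10, 10, 10],
         [10, 80, 80, 10],
         [10, 80, 80, 10],
         [10, 10, 10, 10]]
T, I, F = to_neutrosophic(image, w=3)
```

In this toy example the largest truth values fall on the bright block, while the indeterminacy is highest where a pixel differs most from its neighbourhood mean.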

After the image has been converted to the neutrosophic domain, enhancement is performed on each matrix individually (Salama and Smarandache 2018). The enhancement is twofold, comprising denoising and contrast improvement. Firstly, a power law transformation (gamma correction) is performed on the truth matrix. Secondly, shearlet denoising and Gaussian filtering are applied to the indeterminate matrix to remove multiplicative and additive noise and to enhance the edges. Thirdly, the falseness matrix is subjected to a logarithmic transformation to enhance its details.

Shearlets are a multiscale framework that encodes anisotropic features better than wavelets, preserves significant detail information in despeckled outputs (Easley et al. 2009) and can effectively capture the geometry of images. The discrete shearlets can be denoted as:

$$ SH\left(\psi \right)=\left\{{\psi}_{j,k,m}={\left|\det A\right|}^{j/2}\psi \left({S}^k{A}^j\cdot -m\right):j,k\in Z,m\in {Z}^2\right\} $$
(10)

where j, k and m are scale, direction and shift parameters, respectively. \( {A}_a=\left(\begin{array}{cc}a& 0\\ {}0& \sqrt{a}\end{array}\right) \) is the anisotropy matrix for multi-scale partitions/resolution change, \( {S}_s=\left(\begin{array}{cc}1& s\\ {}0& 1\end{array}\right) \) is the shear matrix for directional analysis/orientation change. a and s are the scale and orientation variables, respectively.

Discrete shearlet transform can be denoted as:

$$ {SH}_{\psi }f\left(j,k,m\right)=\left\langle f,{\psi}_{j,k,m}\right\rangle $$
(11)

and its inverse can be given as:

$$ f=\sum \limits_{j,k\in Z,m\in {Z}^2}\left\langle f,{\psi}_{j,k,m}\right\rangle {\psi}_{j,k,m} $$
(12)

The discrete shearlet transform (Easley et al. 2008) combines a Laplacian pyramid (LP) with directional filtering. The non-subsampled shearlet transform (NSST) (Da Cunha et al. 2006), the variant most commonly used for denoising, replaces the LP with a non-subsampled LP (NSLP) (Abazari and Lakestani 2019), which can improve its effectiveness. The NSLP analysis is represented as \( {NSLP}_{j+1}f=\left({Ah_j}^1\prod \limits_{k=1}^{j-1}{Ah_k}^0\right)f \), where f is the image, \( {NSLP}_{j+1} \) denotes the detail coefficients at scale j + 1, and \( {Ah_k}^0 \) and \( {Ah_j}^1 \) are the low-pass and high-pass filters of the NSLP at scales k and j, respectively.

NSST algorithm for a resolution scale j and number of directions Dj (Abazari and Lakestani 2019) can be summarized as:

  1. Decompose f into a low-pass image \( {f_a}^j \) and a high-pass image \( {f_d}^j \) using the NSLP.

  2. Compute \( {{\overset{\frown }{f}}_d}^j \) on a pseudo-polar grid to obtain \( P{f_d}^j \).

  3. Apply band-pass filtering to \( P{f_d}^j \) to obtain \( {{\left\{{{\overset{\frown }{f}}^j}_{d,k}\right\}}^{D_j}}_{k=1} \).

  4. Apply the inverse FFT on the pseudo-polar grid to obtain the NSST coefficients \( {{\left\{{f^j}_{d,k}\right\}}^{D_j}}_{k=1} \).

In this work, the indeterminate image is denoised after the necessary scaling, using a log-transform approach for multiplicative (speckle) noise removal similar to Porat and Zeevi (1989). The aim is to obtain an estimate \( \hat{I} \) of the indeterminate image that contains mainly the edges, from an image I that contains both noise and edges, by applying soft thresholding to the NSST coefficients of I. The threshold levels are given by \( {\tau}_{j,k}={c}_j{\sigma}_{\gamma_{j,k}} \), where \( {\sigma}_{\gamma_{j,k}} \) is the standard deviation of the noise at scale j and shear directional band k, and cj is a scaling parameter. Five levels of NSLP decomposition are performed; eight 32 × 32 shear filters are used for the first two (coarser) scales, sixteen 16 × 16 shear filters for the third and fourth (finer) scales, and so on. The denoised image is converted back using an exponential transformation.
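The thresholding step can be sketched as follows. This is only an illustration of soft thresholding with the band-dependent level τ = c·σ; the NSST decomposition itself is not implemented, and the function names, the example coefficients and the use of a log(1 + x) offset to avoid log(0) are assumptions rather than details taken from the method.

```python
import math


def to_log(value):
    """Homomorphic step: map a non-negative intensity to the log domain.
    Assumption: log(1 + x) is used so that zero intensities are allowed."""
    return math.log1p(value)


def from_log(value):
    """Inverse of to_log (exponential transformation)."""
    return math.expm1(value)


def soft_threshold(coeff, tau):
    """Shrink a coefficient towards zero by tau; |coeff| <= tau becomes 0."""
    magnitude = abs(coeff) - tau
    return math.copysign(magnitude, coeff) if magnitude > 0 else 0.0


def denoise_band(coeffs, c_j, sigma_jk):
    """Soft-threshold one directional sub-band with tau = c_j * sigma_jk."""
    tau = c_j * sigma_jk
    return [[soft_threshold(v, tau) for v in row] for row in coeffs]


# Hypothetical sub-band coefficients; small values are treated as noise.
band = [[0.5, -0.2],
        [3.0, -4.0]]
denoised = denoise_band(band, c_j=3.0, sigma_jk=0.5)  # tau = 1.5
```

Coefficients whose magnitude is below the threshold are set to zero, and the remaining ones are shrunk by the threshold, which is what distinguishes soft from hard thresholding.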

The neutrosophic image entropy is defined as the summation of the entropies of the three sets. It is used to quantify the degree of indeterminacy in images. Strength of the correlation between truth and false sets with indeterminate set is influenced by the distribution of the pixels and the entropy of the indeterminate set. The enhancement is iterated till the entropy associated with indeterminate set is less than a predefined threshold. Finally, the enhanced truth image is converted back to the Cartesian image. A pixel in Cartesian domain is given as:

$$ f\left(i,j\right)={f}_{\mathrm{min}}+\left({f}_{\mathrm{max}}-{f}_{\mathrm{min}}\right)\overline{T}\left(i,j\right) $$
(13)

where fmin and fmax are the minimum and maximum values in the denoised and enhanced set \( \overline{T} \). The neutrosophic components of a US image and the final pre-processed image obtained during simulation are shown in Fig. 2. The denoised, enhanced image is then passed through Gabor filters to extract textural features, which are used as one of the constraining parameters of the proposed thresholding algorithm.
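The truth-matrix enhancement, the entropy check and the conversion back to the Cartesian domain (Eq. 13) can be sketched as below. The gamma value, the histogram-based Shannon entropy with 16 bins and all function names are assumptions made for illustration; the text does not fix these choices.

```python
import math


def gamma_correct(T, gamma):
    """Power-law transformation of a truth matrix with values in [0, 1]."""
    return [[t ** gamma for t in row] for row in T]


def shannon_entropy(mat, bins=16):
    """Shannon entropy (bits) of the histogram of a matrix in [0, 1].
    Assumption: a fixed-bin histogram is used as the probability estimate."""
    counts = [0] * bins
    n = 0
    for row in mat:
        for v in row:
            counts[min(int(v * bins), bins - 1)] += 1
            n += 1
    return -sum((c / n) * math.log2(c / n) for c in counts if c)


def to_cartesian(T_bar, f_min, f_max):
    """Eq. 13: map the enhanced truth matrix back to grey levels."""
    return [[f_min + (f_max - f_min) * t for t in row] for row in T_bar]


T = [[0.0, 0.25],
     [0.5, 1.0]]
entropy = shannon_entropy(T)            # four equally likely bins
enhanced = gamma_correct(T, gamma=0.5)  # hypothetical gamma value
restored = to_cartesian(enhanced, f_min=0.0, f_max=255.0)
```

In an iterative scheme, the entropy of the indeterminate set would be recomputed after each enhancement pass and compared against the predefined threshold to decide when to stop.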

Fig. 2

a Original US image. b Truth (T) image. c Indeterminate (I) image showing speckles present in the US image. d False (F) image. e Denoised US image

Gabor filter for texture extraction

Gabor filters are extensively used in texture-based segmentation, feature extraction and classification (Porat and Zeevi 1989; Haralick 1979; Weldon et al. 1996; Mirzapour and Ghassemian 2015). A Gabor function can be defined as a sinusoidally modulated Gaussian with spreads σx and σy in the x and y directions and a modulating frequency u0. Its impulse response is given as:

$$ h\left(x,y\right)=\frac{1}{2{\pi \sigma}_x{\sigma}_y}\exp \left\{-\frac{1}{2}\left[\frac{x^2}{{\sigma_x}^2}+\frac{y^2}{{\sigma_y}^2}\right]\right\}\cos \left(2\pi {u}_0x\right) $$
(14)

Equation 14 shows an orientation of 0° w.r.t x-axis. Arbitrary rotations of the filter can be obtained by rotating the function. The frequency u0 and the rotation angle θ define the filter centre location. By tuning the two values to different centre locations, multiple filters can be created that span the entire domain.
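A small sketch of Eq. 14 with rotation is given below. The kernel size, the spreads, the frequency and the number of orientations are illustrative values only, and the function name is hypothetical; rotation is applied by rotating the coordinates before evaluating the function.

```python
import math


def gabor_kernel(size, sigma_x, sigma_y, u0, theta=0.0):
    """Sample the Gabor impulse response of Eq. 14 on a size-by-size grid
    (size is assumed odd), rotated by theta radians."""
    half = size // 2
    norm = 1.0 / (2.0 * math.pi * sigma_x * sigma_y)
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinates so the modulation runs along theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-0.5 * (xr ** 2 / sigma_x ** 2
                                        + yr ** 2 / sigma_y ** 2))
            carrier = math.cos(2.0 * math.pi * u0 * xr)
            row.append(norm * envelope * carrier)
        kernel.append(row)
    return kernel


# Eight orientations at one frequency (illustrative parameter values).
bank = [gabor_kernel(9, 2.0, 2.0, 0.25, k * math.pi / 8) for k in range(8)]
```

Each kernel would then be convolved with the pre-processed image, and the filter responses combined into a texture map.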

Gabor wavelets can be utilized to extract texture features where a set of wavelets are generated using a mother wavelet (Eq. 15) and the whole image is given as input to the wavelet set.

$$ \phi \left(x,y\right)=\frac{1}{2{\pi \sigma}_x{\sigma}_y}\exp \left\{\frac{-1}{2}\left(\frac{x^2}{{\sigma_x}^2}+\frac{y^2}{{\sigma_y}^2}\right)+2\pi {jU}_hx\right\} $$
(15)
$$ h\left(x,y\right)={\left(\frac{U_h}{U_l}\right)}^{\frac{-s}{s-1}}\phi \left(X,Y\right) $$
(16)
$$ X={\left(\frac{U_h}{U_l}\right)}^{\frac{-s}{s-1}}\left[\left(x-{x}_0\right)\cos \left(\frac{d\pi}{N_d}\right)+\left(y-{y}_0\right)\sin \left(\frac{d\pi}{N_d}\right)\right] $$
(17)
$$ Y={\left(\frac{U_h}{U_l}\right)}^{\frac{-s}{s-1}}\left[-\left(x-{x}_0\right)\sin \left(\frac{d\pi}{N_d}\right)+\left(y-{y}_0\right)\cos \left(\frac{d\pi}{N_d}\right)\right] $$
(18)

A filter set is characterized by the choice of a proper set of frequencies and orientations that cover the domain and capture as much texture information as possible. Here, s = 1, 2,…, Ns is the scaling parameter and d = 1, 2,…, Nd is the direction parameter of the wavelets, (x0, y0) is the filter centre in the spatial domain, and Uh and Ul are the maximum and minimum centre frequencies. In the proposed work, 4 scales, each with 8 different orientations, were chosen to capture as much echogenic texture as possible. A US liver tumour, along with its pre-processed version and Gabor texture filtered version obtained during simulation, is shown in Fig. 3.

Fig. 3

a Original. b Enhanced and denoised truth. c Gabor filtered images

Proposed adaptive Otsu thresholding and contour initialization

The widely used Otsu thresholding (Otsu 1978) is an automatic, optimal global thresholding method valued for its effectiveness, simplicity and low computational complexity; it is based on features obtained from intensity histograms. Since US images have poor resolution and contrast, Otsu thresholding based on intensity histograms alone does not yield satisfactory results for thresholding tumour regions, even after contrast adjustment. In our experiments, the Otsu method could not distinguish between the tumour region and certain surrounding tissues of similar intensity. In the proposed methodology for tumour segmentation using active contours, the initial contour is obtained using an adaptive Otsu-based thresholding similar to our previous work (Sivanandan and Jayakumari 2020). The Otsu method based on intensity histograms is briefly summarized here. For an image with L grey levels, ni denotes the number of pixels with grey value i, and the total number of pixels is \( N=\sum \limits_{i=0}^{L-1}{n}_i \). The probability of grey level i is given as \( {p}_i=\frac{n_i}{N} \), with 0 ≤ pi ≤ 1. The image pixels can be divided into two classes, F (foreground) and B (background), by specifying a threshold t: F contains the pixels with grey levels [0, 1,…, t] and B contains the pixels with grey levels [t + 1, t + 2,…, L − 1]. The mean grey level μT, the between-class variance σ2 and the optimal threshold topt are given as:

$$ {\mu}_T=\sum \limits_{i=0}^{L-1}{ip}_i $$
(19)
$$ {\sigma}^2={\varpi}_0{\left({\mu}_0-{\mu}_T\right)}^2+{\varpi}_1{\left({\mu}_1-{\mu}_T\right)}^2 $$
(20)
$$ {t}_{opt}=\arg \max {\sigma}^2 $$
(21)

where ϖ0 and ϖ1 are the probabilities of the two classes and μ0 and μ1 are the corresponding class means. Multiple thresholds are obtained by applying Otsu thresholding to the texture, phase, gradient and phase gradient maps computed from the enhanced despeckled image. Frequency domain operations offer an alternative approach to image enhancement and denoising, in which the image is represented in terms of its magnitude and phase spectra. The phase of an image contains important information about the edges, and discarding the phase during FFT reconstruction results in significant information loss. Although not essentially the same, the concept of phase gradients is used in Hoffman modulation contrast microscopy to enhance or highlight transparent/translucent details embedded within the cell (Sarafis 2013). This theory was extended to detect tumours whose echogenicity is similar to that of the surrounding tissues, and thus to refine the delineation of the region of interest. The steps are summarized as:

  1. Compute the simple Otsu global threshold of the image, say thresh1.

  2. Compute the global threshold of its Gabor texture image, say thresh2.

  3. Transform the image to the Fourier domain and then reconstruct the image using the phase information alone. Compute the global threshold of this phase-reconstructed image, say thresh3. Also compute the threshold of the gradient of the phase-reconstructed image (phase gradient), say thresh4.

  4. Label a pixel as foreground only when its value is less than the minimum of thresh1, thresh2, thresh3 and thresh4. The remaining pixels are labelled as background.
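The Otsu criterion of Eqs. 19–21 and the labelling rule of step 4 can be sketched as follows. The sketch assumes that the texture, phase and phase gradient maps have already been computed and scaled to the same 8-bit range as the image, so that the four thresholds are comparable; that scaling, the function names and the toy pixel values are assumptions.

```python
def otsu_threshold(values, levels=256):
    """Return the grey level t that maximises the between-class variance
    (Eqs. 19-21). `values` is a flat list of integers in [0, levels)."""
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    grand_sum = sum(i * h for i, h in enumerate(hist))
    mu_T = grand_sum / total                      # Eq. 19
    best_t, best_var = 0, -1.0
    n0, s0 = 0, 0                                 # count and sum of class F
    for t in range(levels - 1):
        n0 += hist[t]
        s0 += t * hist[t]
        n1 = total - n0
        if n0 == 0 or n1 == 0:
            continue                              # one class is empty
        w0, w1 = n0 / total, n1 / total
        mu0, mu1 = s0 / n0, (grand_sum - s0) / n1
        var = w0 * (mu0 - mu_T) ** 2 + w1 * (mu1 - mu_T) ** 2  # Eq. 20
        if var > best_var:                        # Eq. 21 (first maximum)
            best_t, best_var = t, var
    return best_t


def adaptive_mask(intensity, feature_maps):
    """Step 4: a pixel is foreground only if its value is below the minimum
    of the Otsu thresholds of the image and of every feature map."""
    thresholds = [otsu_threshold(m) for m in [intensity] + feature_maps]
    limit = min(thresholds)
    return [1 if v < limit else 0 for v in intensity], thresholds


# Toy flattened maps (hypothetical values on a common 0-255 scale).
intensity = [30, 35, 40, 180, 190, 200]
texture = [30, 32, 38, 150, 160, 170]
phase = [45, 50, 60, 200, 210, 220]
phase_grad = [40, 42, 44, 140, 150, 160]
mask, thresholds = adaptive_mask(intensity, [texture, phase, phase_grad])
```

Because the minimum of the four thresholds is used, the rule is more conservative than plain Otsu thresholding: in the toy data the pixel with value 40 passes the intensity threshold alone but is rejected once the texture threshold is taken into account.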

To reduce the effect of any mislabelled regions, morphological operations are performed using a disc-shaped structuring element (with a radius of 4 to 10 pixels, depending on the tumour). A hole filling operation then fills any holes inside the region of interest, and the connected components are labelled. The thresholded output is used for mask creation and not for exact delineation of the tumour boundary. As such, changes in the shape or size of the tumour after the morphological operations are not given much importance.

To reduce the number of unwanted objects, regions at the borders of the thresholded image are deleted as in Liu et al. (2018). A window half the size of the actual image is assumed, centred at the image centre. Any border region that does not intersect this central window is deleted; any border region that does intersect it is truncated by the window, and the process is repeated. Usually, the largest component corresponds to the tumour, and the border of this largest component gives the initial contour/mask. For certain US echogenicities, where the tumour edges are not visible, proper thresholding cannot always be guaranteed: the thresholding locates the tumour, but the shape may not be accurate. In these cases, after the largest component has been chosen, the initial contour can be taken as an appropriate circle centred at the component centre. Even though the proposed thresholding yields a satisfactory first delineation of the tumour boundary, a further segmentation step is required to extract an accurate boundary close to the radiologist’s decision.
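Selecting the largest component and falling back to a circular initial contour can be sketched as below. The sketch covers only these two operations (the border-window rule is not implemented), uses 4-connectivity and takes the circle radius as a free parameter; both choices and the function names are assumptions, since the text only speaks of an "appropriate circle".

```python
from collections import deque


def largest_component(mask):
    """Return the pixel list of the largest 4-connected foreground region."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    best = []
    for i in range(rows):
        for j in range(cols):
            if not mask[i][j] or seen[i][j]:
                continue
            region, queue = [], deque([(i, j)])
            seen[i][j] = True
            while queue:  # breadth-first flood fill
                r, c = queue.popleft()
                region.append((r, c))
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and mask[nr][nc] and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            if len(region) > len(best):
                best = region
    return best


def fallback_contour(mask, radius):
    """Circular initial mask centred on the largest component's centroid.
    Returns None when the mask has no foreground pixels."""
    region = largest_component(mask)
    if not region:
        return None
    cr = sum(r for r, _ in region) / len(region)
    cc = sum(c for _, c in region) / len(region)
    return [[1 if (i - cr) ** 2 + (j - cc) ** 2 <= radius ** 2 else 0
             for j in range(len(mask[0]))] for i in range(len(mask))]


mask = [[1, 0, 0, 0, 0, 0],
        [0, 0, 1, 1, 1, 0],
        [0, 0, 1, 1, 1, 0],
        [0, 0, 1, 1, 1, 0],
        [0, 0, 0, 0, 0, 1]]
region = largest_component(mask)
contour = fallback_contour(mask, radius=1)
```

The two isolated pixels in the corners are ignored, and the circle is centred on the 3 × 3 block.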

Active contour segmentation using localized energies

Active contours are curves or surfaces that move under the effect of internal forces or pressure defined within the curve and external forces computed from the image data. The internal forces ensure smoothness during contour deformation, and the external forces move the curve towards or away from the boundaries or other features of interest. Segmentation based on this concept exhibits higher accuracy than classical segmentation procedures. In this paper, the localized region-based active contours proposed by Lankton and Tannenbaum (2008) are employed, and the methodology is briefly summarized below.

For a closed contour C of an image I in domain Ω represented as zero level set of signed distance function, its interior is specified by the smoothened Heaviside approximation as:

$$ H\varPhi (x)=\left\{\begin{array}{c}1\kern10em \varPhi (x)<-\varepsilon \\ {}0\kern10em \varPhi (x)>+\varepsilon \\ {}\frac{1}{2}\left\{1+\frac{\varPhi }{\varepsilon }+\frac{1}{\pi}\sin \left(\frac{\pi \varPhi (x)}{\varepsilon}\right)\right\}\kern0.5em \mathrm{otherwise}\kern3.5em \end{array}\right. $$
(22)

The exterior is given as 1 − HΦ(x). The area just around C is given by the derivative of Eq. (22), a smoothed Dirac delta function:

$$ \delta \varPhi (x)=\left\{\begin{array}{c}1\kern10em \varPhi (x)=0\\ {}\kern0.75em 0\kern10em \left|\varPhi (x)\right|>\varepsilon \\ {}\frac{1}{2\varepsilon}\left\{1+\cos \left(\frac{\pi \varPhi (x)}{\varepsilon}\right)\right\}\kern0.75em \mathrm{otherwise}\kern3.5em \end{array}\right. $$
(23)

A signed distance function (SDF, ϕ) is calculated from an initial mask using the Euclidean distance. A contour C is represented by the zero level set of the SDF, i.e. C = {x | ϕ(x) = 0}.
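The smoothed Heaviside and Dirac functions can be sketched as below, with ϕ negative inside the contour. Two points are assumptions of this sketch rather than statements of the method: the transition branch of the Heaviside function is written with the sign that makes it fall continuously from 1 to 0 across the band |ϕ| ≤ ε (which differs in sign from the middle branch as printed in Eq. 22), and the separate ϕ = 0 case of the Dirac function is folded into the cosine branch. The default ε is illustrative.

```python
import math


def heaviside(phi, eps=1.5):
    """Smoothed interior indicator: 1 well inside (phi < -eps),
    0 well outside (phi > eps), smooth transition in between.
    Assumption: sign of the transition chosen for continuity."""
    if phi < -eps:
        return 1.0
    if phi > eps:
        return 0.0
    return 0.5 * (1.0 - phi / eps - math.sin(math.pi * phi / eps) / math.pi)


def dirac(phi, eps=1.5):
    """Smoothed Dirac delta: non-zero only in the band |phi| <= eps."""
    if abs(phi) > eps:
        return 0.0
    return (1.0 + math.cos(math.pi * phi / eps)) / (2.0 * eps)


# Signed distances sampled across the contour (negative = inside).
samples = [-3.0, -1.5, 0.0, 1.5, 3.0]
interior = [heaviside(p) for p in samples]
band = [dirac(p) for p in samples]
```

The Dirac function is the magnitude of the derivative of this Heaviside function, so it integrates to one across the band and selects only points close to the contour.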

The characteristic function B(x, y), in terms of two points x and y, is used to define the mask region. It equals 1 when the point y is within a ball of radius r centred at x and 0 elsewhere:

$$ B\left(x,y\right)=\left\{\begin{array}{c}1\kern1.5em \left|\left|x-y\right|\right|<r\\ {}0\kern2.25em \mathrm{otherwise}\end{array}\right. $$
(24)

The energy function, in terms of the characteristic function B and a generic internal energy F representing adherence to the model at each point of the contour, is defined as:

$$ E\left(\varphi \right)=\underset{\Omega_x}{\int}\delta \varphi (x)\underset{\Omega_y}{\int }B\left(x,y\right).F\left(I(y),\varphi (y)\right) dydx $$
(25)

To compute the energy, only points near C are considered, and inhomogeneities far away are ignored, which allows a wider range of objects to be captured. This is ensured by multiplication with the Dirac function in the outer integral, which also ensures that the curve topology does not change suddenly while still allowing merging or splitting. A regularization term weighted by a parameter λ is added to ensure the smoothness of the curve, and the final energy and evolution equation are given as:

$$ {\displaystyle \begin{array}{l}E\left(\varphi \right)=\underset{\Omega_x}{\int}\delta \varphi (x)\underset{\Omega_y}{\int }B\left(x,y\right).F\left(I(y),\varphi (y)\right) dydx+\lambda \underset{\Omega_x}{\int}\delta \varphi (x)\left\Vert \nabla \varphi (x)\right\Vert dx\\ {}\end{array}} $$
(26)
$$ {\displaystyle \begin{array}{l}\frac{\partial \varphi (x)}{\partial t}=\delta \varphi (x)\underset{\Omega_y}{\int }B\left(x,y\right).{\nabla}_{\varphi (y)}F\left(I(y),\varphi (y)\right) dy+\lambda .\delta \varphi (x)\mathit{\operatorname{div}}\left(\frac{\nabla \varphi (x)}{\left|\nabla \varphi (x)\right|}\right)\\ {}\end{array}} $$
(27)

The global mean intensities of the interior and exterior regions (u and v) and their localized versions (ux and vx), which are needed to determine the local energies, are given as:

$$ u=\frac{\underset{\Omega_y}{\int } H\varphi (y)I(y) dy}{\underset{\Omega_y}{\int } H\varphi (y) dy} $$
(28)
$$ v=\frac{\underset{\Omega_y}{\int}\left(1- H\varphi (y)\right)I(y) dy}{\underset{\Omega_y}{\int}\left(1- H\varphi (y)\right) dy} $$
(29)
$$ {u}_x=\frac{\underset{\Omega_y}{\int }B\left(x,y\right) H\varphi (y)I(y) dy}{\underset{\Omega_y}{\int }B\left(x,y\right) H\varphi (y) dy} $$
(30)
$$ {v}_x=\frac{\underset{\Omega_y}{\int }B\left(x,y\right)\left(1- H\varphi (y)\right)I(y) dy}{\underset{\Omega_y}{\int }B\left(x,y\right)\left(1- H\varphi (y)\right) dy} $$
(31)

The Chan-Vese method (Chan and Vese 2001) uses a constant intensity model energy represented by u and v, called the uniform modelling energy. This energy was modified using the localized means ux and vx to obtain the localized energy (Lankton and Tannenbaum 2008). The Chan-Vese energy is given as:

$$ {E}_{UM}=\underset{\Omega_y}{\int } H\varphi (y){\left(I(y)-u\right)}^2+\left(1- H\varphi (y)\right){\left(I(y)-v\right)}^2 dy $$
(32)

The internal energy measure using the localized means is given as:

$$ {F}_{UM}= H\varphi (y){\left(I(y)-{u}_x\right)}^2+\left(1- H\varphi (y)\right){\left(I(y)-{v}_x\right)}^2 $$
(33)

The minimum energy is obtained when every point on C has moved such that the local interior and exterior at each point are best approximated by ux and vx. This localized energy based active contour segmentation was used in the proposed work, with the initial contour obtained from the adaptive thresholding described above. The method performs well for low-contrast echogenicities and for images exhibiting posterior acoustic shadowing.
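The localized means of Eqs. 30–31 and the local uniform modelling energy of Eq. 33, evaluated at a single contour point, can be sketched as follows. The sketch does not implement the curve evolution of Eq. 27; the function names, the toy image and the choice of ε are assumptions, and the Heaviside function uses the continuity-preserving sign convention (1 inside, where ϕ is negative).

```python
import math


def heaviside(phi, eps):
    """Smoothed interior indicator (1 inside, 0 outside the contour)."""
    if phi < -eps:
        return 1.0
    if phi > eps:
        return 0.0
    return 0.5 * (1.0 - phi / eps - math.sin(math.pi * phi / eps) / math.pi)


def ball_pixels(image, x, radius):
    """Yield (i, j, value) for pixels y with B(x, y) = 1 (Eq. 24)."""
    xi, xj = x
    for i, row in enumerate(image):
        for j, value in enumerate(row):
            if (i - xi) ** 2 + (j - xj) ** 2 < radius ** 2:
                yield i, j, value


def local_means(image, phi, x, radius, eps=0.5):
    """Return (u_x, v_x) of Eqs. 30-31; an empty region gives 0.0."""
    in_sum = in_area = out_sum = out_area = 0.0
    for i, j, value in ball_pixels(image, x, radius):
        h = heaviside(phi[i][j], eps)
        in_sum += h * value
        in_area += h
        out_sum += (1.0 - h) * value
        out_area += 1.0 - h
    u_x = in_sum / in_area if in_area else 0.0
    v_x = out_sum / out_area if out_area else 0.0
    return u_x, v_x


def local_energy(image, phi, x, radius, eps=0.5):
    """Sum of F_UM (Eq. 33) over the ball centred at x."""
    u_x, v_x = local_means(image, phi, x, radius, eps)
    energy = 0.0
    for i, j, value in ball_pixels(image, x, radius):
        h = heaviside(phi[i][j], eps)
        energy += h * (value - u_x) ** 2 + (1.0 - h) * (value - v_x) ** 2
    return energy


# Toy image: dark region (columns 0-2) next to a bright region (columns 3-4).
image = [[10, 10, 10, 90, 90] for _ in range(5)]
phi_true = [[2.0 * (j - 2.5) for j in range(5)] for _ in range(5)]
phi_shifted = [[2.0 * (j - 3.5) for j in range(5)] for _ in range(5)]
energy_true = local_energy(image, phi_true, (2, 2), radius=2.0)
energy_shifted = local_energy(image, phi_shifted, (2, 2), radius=2.0)
```

When the contour coincides with the true boundary, the local interior and exterior are each uniform and the local energy vanishes; shifting the contour by one column mixes dark and bright pixels inside it and raises the energy.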

Proposed methodology for ultrasound tumour segmentation

The proposed work focuses on segmentation of liver tumours using active contours, with the initial mask derived from the adaptive Otsu thresholding described above. The work was applied to 40 B-mode US liver tumour images (18 hypoechoic, 8 isoechoic, 8 anechoic and 6 hyperechoic tumours) from www.ultrasoundcases.info. The methodology was implemented in MATLAB R2017b on a system with a clock speed of 2 GHz and 4 GB of memory. The block diagram of the proposed methodology is shown in Fig. 4.

Fig. 4

Block diagram of the proposed methodology for ultrasound tumour segmentation

The methodology for the proposed segmentation method is as follows:

  1. Perform the neutrosophic enhancement on the US image, in which the edges and noise are processed using NSST.

  2. Threshold the image according to the proposed adaptive thresholding using constraints derived from texture, phase and phase gradient.

  3. Process the thresholded output using morphological operations (if necessary) and delete any border regions. This step results in the creation of a mask/initial contour.

  4. Segment the image using localized energy based active contours, with the result of the previous step as the initial contour.

Results

The methodology was applied to tumours of varying echogenicities, including tumours with complicated boundaries and shadows. Figure 5 compares the suggested adaptive thresholding with simple Otsu thresholding and entropy thresholding; the proposed method, which is based on multiple features, localizes the tumour markedly better. To highlight the improvement offered by the proposed thresholding, visual results are shown only for cases with complicated echoes.

Fig. 5

a Original image. b Proposed thresholding. c Otsu thresholding. d Entropy thresholding

The segmented outputs are shown in binary form to illustrate the accuracy of the delineation of the tumour region. Most works in the literature focus on centred tumours and tumours with well-defined boundaries, whereas the current work also considers non-centred tumours. Figure 6 demonstrates the efficacy of the proposed adaptive thresholding combined with localized energy based active contour segmentation, compared with global energy based Chan-Vese active contour segmentation. The ground truth was provided by an experienced radiologist, who highlighted the tumour region of interest.

Fig. 6 a Original image. b Ground truth. c Chan-Vese active contour segmentation using proposed thresholding. d Localized energy based active contour segmentation using proposed thresholding

The methodology was also applied to challenging US tumour images, such as slightly hypoechoic tumours and images with posterior acoustic shadows. As shown in Fig. 7, the results are better for localized energy based segmentation. For easier interpretation, the binary segmentation masks are not shown.

Fig. 7 a Original tumour image. b Proposed method of segmentation. c Chan-Vese segmentation

The quantitative performance of the proposed method was compared with other segmentation techniques, and the average values are shown in Table 1. The proposed thresholding using phase gradients was compared with Otsu thresholding (Otsu 1978) and entropy thresholding (Kapur et al. 1985). The active contour segmentation methods used for comparison were the Chan-Vese method (Chan and Vese 2001) and the localized energy method (Lankton and Tannenbaum 2008).
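For readers unfamiliar with entropy thresholding, the following is a minimal sketch of the maximum-entropy criterion commonly attributed to Kapur et al., in which the threshold is chosen to maximise the sum of the entropies of the two classes. It operates on a grey-level histogram and is only an illustration, not the implementation used for the comparison; the function name is hypothetical.

```python
import math


def kapur_threshold(hist):
    """Return t maximising H(values <= t) + H(values > t)."""
    total = sum(hist)
    p = [h / total for h in hist]
    best_t, best_entropy = 0, -math.inf
    for t in range(len(p) - 1):
        low, high = p[:t + 1], p[t + 1:]
        w0, w1 = sum(low), sum(high)
        if w0 <= 0.0 or w1 <= 0.0:
            continue  # one class is empty, so the split is not meaningful
        # Entropy of each class, using probabilities renormalised per class.
        h0 = -sum((q / w0) * math.log(q / w0) for q in low if q > 0)
        h1 = -sum((q / w1) * math.log(q / w1) for q in high if q > 0)
        if h0 + h1 > best_entropy:
            best_entropy, best_t = h0 + h1, t
    return best_t
```

For a histogram with two well-separated modes, the selected threshold falls between the modes. Unlike Otsu's criterion, which depends on the class means, this criterion depends only on how the probability mass is spread within each class.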

Table 1 Performance comparison showing average values with standard deviations of local energy based active contour and Chan-Vese segmentation using entropy thresholding, Otsu thresholding and proposed thresholding for contour initialization. SSIM structural similarity, MSE mean square error, JI Jaccard Index, DC Dice coefficient, SD standard deviation
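The overlap measures named in the caption can be made concrete with a short sketch. It assumes that the segmentation and the ground truth are binary masks of equal size stored as lists of rows, and that MSE is taken between these two masks; the paper does not state these details, so they are assumptions. SSIM is omitted because it requires windowed image statistics. All function names are hypothetical.

```python
def confusion_counts(pred, truth):
    """Count true/false positives and negatives between two binary masks."""
    tp = fp = fn = tn = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def jaccard_index(pred, truth):
    """JI = |A and B| / |A or B|."""
    tp, fp, fn, _ = confusion_counts(pred, truth)
    union = tp + fp + fn
    return tp / union if union else 1.0


def dice_coefficient(pred, truth):
    """DC = 2 |A and B| / (|A| + |B|)."""
    tp, fp, fn, _ = confusion_counts(pred, truth)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0


def accuracy(pred, truth):
    """Fraction of pixels labelled identically in both masks."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    return (tp + tn) / (tp + fp + fn + tn)


def mean_square_error(pred, truth):
    """Mean squared difference between the two masks (0/1 valued)."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    return (fp + fn) / (tp + fp + fn + tn)
```

Under these assumptions JI and DC are related by DC = 2 JI / (1 + JI), and for binary masks the MSE equals one minus the accuracy.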

The performance of the proposed adaptive thresholding on ultrasound tumours of different echogenicities was demonstrated in our previous work (Sivanandan and Jayakumari 2020). Figure 8 shows that the proposed adaptive thresholding combined with localized energy based active contour segmentation performs better than simple Otsu thresholding combined with the same segmentation, in terms of SSIM, MSE, JI, DC and accuracy. Since histogram (Otsu) thresholding gave better results than entropy thresholding, the former was used for comparison against the proposed thresholding. For this comparison, 40 ultrasound liver tumours (benign and malignant) of different echogenicities were considered, with the true tumour boundary delineations provided by the radiologist. Except in the case of highly isoechoic tumours, the proposed methodology gave the best results among the thresholding/segmentation methods compared.

Fig. 8 Performance of the localized energy based active contour segmentation using Otsu thresholding and proposed thresholding for contour initialization on the basis of a SSIM, b MSE, c JI, d DC and e accuracy

Discussion

Localized energy based active contour segmentation, with the initial contour generated by the proposed thresholding, yields better segmentation than Chan-Vese region based active contour segmentation. Compared with Liu et al. (2018), who used modified Chan-Vese active contours based on the ratio of exponentially weighted averages, the proposed method gave better segmentation for isoechoic and hypoechoic tumours, since local energies rather than global average energies are targeted during contour deformation. Relative to that method, the segmentation accuracy of the proposed work was higher by 35% for hypoechoic tumours and by 52% for challenging isoechoic tumours. For most tumours with sufficient contrast against the surrounding tissues, Chan-Vese segmentation yields satisfactory results. The difference between the two methods is significant for tumours that have insufficient contrast or intensities similar to those of the surrounding tissues.
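The difference between global and local region statistics can be seen in a one-dimensional toy example. The sketch below only computes the inside/outside means that drive the two kinds of energy; it does not evolve a contour, and the signal, mask and function names are hypothetical values chosen for illustration.

```python
def mean(values):
    """Arithmetic mean, or NaN for an empty region."""
    return sum(values) / len(values) if values else float("nan")


def region_means(signal, mask):
    """Means inside and outside the mask over the whole signal,
    as used by a global (Chan-Vese style) region energy."""
    inside = [v for v, m in zip(signal, mask) if m]
    outside = [v for v, m in zip(signal, mask) if not m]
    return mean(inside), mean(outside)


def local_region_means(signal, mask, centre, radius):
    """Means inside and outside the mask, restricted to a window around
    `centre`, as used by a localized region energy."""
    lo = max(0, centre - radius)
    hi = min(len(signal), centre + radius + 1)
    return region_means(signal[lo:hi], mask[lo:hi])


# Toy profile: bright tissue (10), a darker lesion (6), shadowed tissue (4).
signal = [10, 10, 10, 6, 6, 4, 4, 4]
mask = [0, 0, 0, 1, 1, 0, 0, 0]

global_in, global_out = region_means(signal, mask)
left_in, left_out = local_region_means(signal, mask, centre=3, radius=1)
right_in, right_out = local_region_means(signal, mask, centre=4, radius=1)
```

In this example the global outside mean averages the bright and shadowed tissue to 7, only one grey level from the lesion mean of 6, whereas the local windows at the two lesion edges see outside means of 10 and 4. This is the sense in which local statistics retain contrast that a global average loses under intensity inhomogeneity.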

The proposed methodology was also compared with active contours using gradient vector flows (Cvancarova et al. 2005), implemented on the current dataset, and similar results were obtained. The average difference in segmentation accuracy between the two methods, both using the proposed thresholding, was approximately 1.4%, with the proposed method giving better values even for hypoechoic and isoechoic tumours. As an alternative to active contour based segmentation, unsupervised clustering based segmentation, as proposed by Sajith and Hariharan (2015) and Das and Sabut (2016), was also studied and implemented on the current dataset. Fuzzy c-means based clustering failed to give accurate delineations for challenging tumours, with a best average accuracy of only 79.5%.
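As background for the clustering alternative, a textbook fuzzy c-means iteration on scalar intensities is sketched below. It is a generic illustration and not the implementation of the cited works; the function name, the initial centres and the small constant `eps` (which avoids division by zero when a value coincides with a centre) are assumptions.

```python
def fuzzy_c_means(values, centres, m=2.0, iterations=50, eps=1e-12):
    """Cluster scalar values; returns (centres, memberships).

    memberships[k][i] is the degree to which values[k] belongs to cluster i.
    m > 1 is the fuzziness exponent.
    """
    centres = list(centres)
    memberships = []
    power = 2.0 / (m - 1.0)
    for _ in range(iterations):
        # Membership update: u_ik = 1 / sum_j (d_ik / d_jk)^(2 / (m - 1)).
        memberships = []
        for x in values:
            dists = [abs(x - c) + eps for c in centres]
            memberships.append(
                [1.0 / sum((d_i / d_j) ** power for d_j in dists)
                 for d_i in dists]
            )
        # Centre update: mean of the values weighted by u_ik^m.
        for i in range(len(centres)):
            weights = [row[i] ** m for row in memberships]
            centres[i] = (sum(w * x for w, x in zip(weights, values))
                          / sum(weights))
    return centres, memberships


def hard_labels(memberships):
    """Assign each value to the cluster with the largest membership."""
    return [row.index(max(row)) for row in memberships]
```

Because the clustering uses intensity alone, pixels of an isoechoic tumour and of the surrounding tissue receive similar memberships, which is consistent with the weaker delineation reported for challenging tumours.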

For pre-processing, it was found that employing entropy features on the current dataset, as in Zong et al. (2019), does not work efficiently in the presence of severe speckle. The proposed work, on the other hand, uses the neutrosophic set concept and shearlets to denoise and enhance the image, and gave better localization of the tumour. On this dataset, the method adopted by Lotfollahi et al. (2018) is computationally costlier than the proposed method by approximately 15%.

It is necessary to discuss the purpose of using traditional segmentation algorithms such as active contours or thresholding in an era of deep learning based semantic segmentation. The performance of any deep learning algorithm depends heavily on the availability of proper training and validation datasets. The larger the dataset and the greater the variance of the images in it, the better the network trains towards optimal weights and biases. Ultrasound image databases are limited, and most medical centres do not make such datasets available online. Because of this lack of sufficient data, it is difficult to trust a deep learned model to predict accurately on an unknown query image; this remains a challenge and an active area of research in deep learning (Liu et al. 2019, Brattain et al. 2018). In such a scenario, opting for traditional segmentation algorithms, in which a certain level of manual intervention or supervision is possible, would be more prudent.

Conclusion

US images are corrupted by noise and system artefacts, which cause difficulties during feature or region of interest extraction. The proposed work focuses on extracting tumours as the region of interest from US liver tumour images. Tumour segmentation involves two steps: identification of the region of interest, followed by segmentation of the tumour itself. Region identification employs pattern recognition techniques, with thresholding being the most commonly used to create an initial tumour region mask. Common segmentation methods based on an initial mask are active contours (commonly called snakes), which may fail to extract the boundaries accurately if the initial mask is improperly defined. The proposed adaptive Otsu thresholding, based on multiple predicates using texture, phase, gradient and phase gradient, yields better initial contours than other works in the literature that use manually predefined masks.

As a pre-processing step, and unlike other existing works, the US image is neutrosophically enhanced and denoised using shearlets, taking into account the fuzzy nature of US images. Since US images suffer from intensity inhomogeneities, traditional Chan-Vese segmentation yields unsatisfactory results; this is resolved by using localized energy based active contour segmentation, which considers the localized means in the contour and its neighbourhood. The segmentation results show significant improvement over the existing optimal global thresholding. Additional advantages are that fewer morphological operations are needed to refine the thresholded output for mask creation, and that the method works even on images with posterior acoustic shadows (PAS) and iso/slightly hypoechoic tumours. The proposed method is expected to help in diagnostic procedures where the radiologist can decide the region of interest with minimum processing. In future, the work may be extended to identify the tumour as benign or malignant by implementing automatic classification methods using features from the detected/segmented tumour. Further refinements may be made to the proposed thresholding to obtain still better masks.