1 Introduction

In computer vision, analysing a natural scene that contains both large-scale structures and small-scale details is a challenging problem. The human visual system handles such scenes easily, without explicit image abstraction (elimination of small details). Salient image features, rather than small-scale details and textures, are the basic data for human perception [3]. In the real world, large structures, important edges, and textures are the principal features that build up a scene. Therefore, separating the overall structure from the highly correlated background is an essential task in computer vision.

Edge-aware filters are essential tools in image decomposition. The key idea is to decompose an image into two components: large-scale structures and small-scale details. Existing filters, such as the bilateral filter [34], guided filter [13], l0-smoothing filter [37], BMA filters [8], domain transform filter [11], and semi-guided bilateral filter [33], tend to remove small details while preserving prominent structures. However, they are not explicitly designed to handle the structure extraction problem. Applying an edge-aware filter to a textured image may preserve the texture and produce undesirable results, because these filters treat some texture components as distinctive edges.

To address this problem, researchers have developed structure extraction techniques for the texture smoothing task. In the following, we list only some well-known structure-preserving texture smoothing algorithms: the local extrema filter (LE) [32], rolling guidance filter (RGF) [40], and relative total variation (RTV) [38]. In the past few years, a considerable number of works on structure extraction filtering have been published, including the region covariances filter (RCF) [16], bilateral texture filter (BTF) [7], Laplacian texture filter (LTF) [9], scale-aware texture smoothing (SATS) [14], zero crossing structure decomposition filter (ZCSD) [15], and reductive regression filter (RRF) [31]. In addition to structure preservation, these filters are widely used in many applications, such as seam carving [16], hatching and image abstraction [41], seamless image cloning and image vectorization [38], image detail enhancement [2], JPEG artifact removal [9], and image inpainting [36]. All the aforementioned structure preservation filters can also be used in visual tracking, which is a very important application; we refer interested readers to [18,19,20,21,22].

Inspired by the recent success of these filters and their valuable applications, the aim of this work is to explore a new structure-texture decomposition filter. The basic idea of this work is based on the classical anisotropic diffusion algorithm [27]. The proposed method is implemented in two main steps. In the first step, a pre-smoothing is utilized to mitigate the high contrast texture information. In the second step, the resultant image from the first step is then processed by the new anisotropic filter. The key difference between this work and the work based on the classical anisotropic diffusion is that in this work a new directional second neighbour derivative (SND) operator is used to obtain the image gradient, while in the classical anisotropic diffusion the first neighbour derivative operator (FND) is used in the gradient calculation. Since the SND operator is less sensitive to the texture information than the FND operator, the proposed filter produces better results in terms of suppressing high-frequency oscillation while preserving distinctive structures.

The paper is organized as follows. In Section 2, we review related texture smoothing algorithms. In Section 3, we provide background information on anisotropic diffusion as well as definitions of the first and second neighbour derivative operators. In Section 4, we present the proposed texture smoothing method, including the selection of the pre-smoothing filter and a discussion of the optimal selection of parameters from a total variation point of view. In Section 5, we present experimental results and comparisons with state-of-the-art approaches in a wide range of applications, including texture smoothing, texture transfer, contrast enhancement, superpixel segmentation, and pencil sketch. Conclusions are given in Section 6.

2 Related work

Many image decomposition techniques have been proposed to separate an image into a piecewise constant smooth base layer and a detail layer. In the following, we briefly review previous works in the area of structure extraction smoothing relating to the proposed filter.

Edge-preserving smoothing aims at removing fine-scale details and textures in an image without blurring large-scale object boundaries and sharp edges. The bilateral filter [34] is broadly utilized in noise removal and detail smoothing while preserving significant edges. Petschnigg et al. and Kopf et al. proposed an extension of the bilateral filter called the joint/cross bilateral filter [17, 28]. In this filter, the range kernel is computed from a guidance image, which guides the smoothing process. The guided filter [13] is a local neighbourhood filter formulated on a local linear model. Similar to the bilateral filter, the guided filter smooths an image by considering the content of the guidance image. It has good edge-preserving properties and is more computationally efficient than the bilateral filter. Moreover, the guided filter produces images free of the gradient reversal artifacts that the bilateral filter suffers from in detail enhancement applications. Zhang et al. [40] exploited scale-aware theory to first produce a smoothed image in which small textures are eliminated. This image is employed as a guidance image, and the joint bilateral filter is then applied iteratively to recover the large edges from the input image. Unfortunately, their results may contain blocking artifacts due to coarse smoothing of the guidance image.

In addition to local average-based filtering, there are several global optimization-based approaches, such as total variation (TV) [29], weighted least squares (WLS) [10], and l0-gradient minimization [37]. TV [29] is an edge-preserving regularization framework that has been shown to produce good results in smoothing out textures of irregular shape, by enforcing TV regularization constraints that retain the edges of salient objects. The original formulation was extended in [4, 6, 39], which use different norms for the data fidelity and regularization terms. They demonstrated that using a robust norm could improve structure-texture separation. Farbman et al. [10] introduced a weighted least squares algorithm to overcome problems such as halo artifacts, which appear due to coarse smoothing. Their global optimization function smooths the image everywhere except in areas with distinctive gradient values. Xu et al. [37] proposed a robust smoothing algorithm that utilizes a sparse gradient measure. The regularization term of l0-smoothing uses the number of pixels with non-zero gradient as a regularization constraint. Consequently, this approach can remove the small gradient magnitudes of image details while retaining, and even enhancing, the distinctive objects in an image. All the aforementioned edge-aware filters depend on image intensity differences or gradient magnitudes. Thus, the smoothing process generally cannot distinguish texture and small-scale details from salient image features, and unsatisfactory structure-texture separation results might appear. Another limitation of edge-preserving filters is that they are not explicitly designed to handle high-frequency oscillatory components in an image. Figure 1 shows examples of using edge-preserving filters in texture-structure smoothing.

Fig. 1 A comparison of different global optimization-based edge-preserving filters in texture smoothing. a Input image. b TV [29] (λ = 0.4). c WLS [10] (λ = 2, α = 3). d L0-smoothing [37] (λ = 0.1, κ = 2)

Due to the limitations of edge-aware smoothing algorithms, many researchers have developed algorithms that are better able to distinguish meaningful structures from textures in an image. Xu et al. [38] employed a non-linear relative total variation (RTV) metric which allows accurate identification of textures and smooths them out. They demonstrated that adopting such a spatially varying metric can yield good structure-texture separation results. Karacan et al. [16] introduced a patch-based approach to decompose an image into large structural and small textural components. Region covariance descriptors, which encode intensity and orientation, are used as the basis for comparing the similarity between two patches on a Riemannian manifold. However, this approach may produce over-smoothed results due to the overlapping patches, and it is computationally expensive. Cho et al. [7] introduced a patch shift mechanism into the joint bilateral filter to achieve structure-texture decomposition. They also proposed an improved measure of the relative total variation, called the modified relative total variation, to enhance the filtering performance. The joint bilateral filter is employed by incorporating the results of the patch shift into the range kernel to obtain the smoothed output.

3 Anisotropic diffusion and the SND operator

In this section, we first give a brief description of the anisotropic diffusion filter. Definitions of the first neighbour derivative (FND) and the second neighbour derivative (SND) are also provided. Then, a justification for using the SND operator in our structure extraction filter is presented. Finally, we generalize the 1-D SND operator to the 2-D SND operator.

3.1 Anisotropic diffusion

Anisotropic diffusion is a non-linear diffusion technique for image denoising based on a partial differential equation. Perona and Malik modified the classical diffusion filter to avoid its edge blurring problem by introducing stopping functions that stop the diffusion across significant boundaries [27]. The discrete version of the anisotropic diffusion equation is given by

$$ I^{k + 1}{(x,y)}=I^{k} {(x,y)}+{\lambda}\sum\limits_{d = 1}^{8} S(|\nabla_{d} I^{k}({x,y})|) \nabla_{d} I^{k}({x,y}) $$
(1)

where \(I^{k}(x,y)\) represents an image, k denotes the iteration (discrete time step), and x, y denote the coordinates of the pixel to be processed. The constant λ ∈ (0,1/4] determines the diffusion rate. \(\nabla_{d} I^{k}(x,y)\) is the derivative of the image in the dth direction, where d = 1, ..., 8 corresponds to the angle (d − 1)π/4. For example, the derivative operators for d = 1 and d = 2, which correspond to the directions 0° and 45°, are given in (2) and (3), respectively.

$$ \nabla_{1} I(x,y) =I(x,y + 1)-I(x,y) $$
(2)
$$ \nabla_{2} I(x,y) =I(x-1,y + 1)-I(x,y) $$
(3)

Their corresponding filter masks are shown in Fig. 2a and b.

Fig. 2 Filter masks corresponding to the FND and SND operators. a and b The FND masks. c and d The proposed SND masks

The stopping function reduces diffusivity at distinctive boundaries. Two widely used stopping functions are:

$$ S(|\nabla_{d} I(x,y)|)=\exp\left[-\left( \frac{|\nabla_{d} I(x,y)|}{\sigma}\right)^{2}\right] $$
(4)

and

$$ S(|\nabla_{d} I(x,y)|)=\frac{1}{1+\left( \frac{|\nabla_{d} I(x,y)|}{\sigma}\right)^{2}} $$
(5)

where σ is a scale parameter that plays an important role in smoothing the texture in an image. These functions satisfy the requirement S(|X|) → 0 as |X| → ∞, so that the diffusion stops at edges.
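As a minimal sketch of the two stopping functions (4) and (5), the following Python snippet evaluates them for a few gradient magnitudes. The function names and the sample values are illustrative assumptions, and the intensities are assumed to be normalised to [0, 1] so that a scale such as σ = 0.04 is meaningful.

```python
import math

def stop_exp(grad, sigma):
    """Exponential stopping function, Eq. (4)."""
    return math.exp(-(abs(grad) / sigma) ** 2)

def stop_rational(grad, sigma):
    """Rational stopping function, Eq. (5)."""
    return 1.0 / (1.0 + (abs(grad) / sigma) ** 2)

if __name__ == "__main__":
    sigma = 0.04  # illustrative scale; assumes intensities in [0, 1]
    for g in (0.0, 0.02, 0.04, 0.2, 1.0):
        print(f"|grad| = {g:4.2f}  "
              f"exp: {stop_exp(g, sigma):.4f}  "
              f"rational: {stop_rational(g, sigma):.4f}")
```

Both functions equal 1 for a zero gradient and decay towards 0 as the gradient magnitude grows, which is the requirement stated above. The exponential form decays much faster than the rational form once the gradient exceeds σ.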

3.2 The second neighbour derivative operator

3.2.1 Definition

In the discrete domain, a common way to define a derivative operator is the numerical differentiation. For simplicity, we only consider the 1-D signal. The discrete derivative at location n denoted by ∇I(n) is defined as

$$ \nabla I (n) = \frac{I(n + \tau) - I (n)}{\tau} $$
(6)

where τ is a positive constant and ∇I(n) measures the difference between two pixels. Setting τ = 1 leads to the first neighbour derivative (FND) operator

$$ \nabla^{(1)} I(n) = I(n + 1) - I(n) = M *I $$
(7)

where the convolution mask is M = [−1 1]. The FND operator is the one used in the classical anisotropic diffusion filter to determine the image gradient.

On the other hand, setting τ = 2, we obtain the second neighbour derivative (SND) operator

$$ \nabla^{(2)} I(n) = \frac{I(n + 2) - I(n)}{2} $$
(8)

The SND operator \(\nabla^{(2)} I(n)\) can be regarded as a smoothed version of \(\nabla^{(1)} I(n)\) and is given by:

$$ \nabla^{(2)} I(n) = h * \nabla^{(1)} I(n) $$
(9)

where h = [1 1]/2 is the simplest low-pass filter. Intuitively, the derivative calculated using the SND can be regarded as the result of low-pass filtering the derivative calculated using the FND. The low-pass filter suppresses high-frequency components in \(\nabla^{(1)} I(n)\). As such, anisotropic diffusion using the SND is less sensitive to texture, leading to better smoothing of the texture component in an image.
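The relation in (9) can be checked numerically. The sketch below is an illustration rather than part of the proposed method: it applies the FND (7) and the SND (8) to a hypothetical 1-D signal (a period-2 texture riding on a step edge) and confirms that averaging adjacent FND samples with h = [1 1]/2 reproduces the SND.

```python
def fnd(signal):
    """First neighbour derivative, Eq. (7): I(n+1) - I(n)."""
    return [signal[n + 1] - signal[n] for n in range(len(signal) - 1)]

def snd(signal):
    """Second neighbour derivative, Eq. (8): (I(n+2) - I(n)) / 2."""
    return [(signal[n + 2] - signal[n]) / 2 for n in range(len(signal) - 2)]

def smooth_pairs(values):
    """Apply h = [1 1]/2: average each pair of adjacent samples."""
    return [(values[n] + values[n + 1]) / 2 for n in range(len(values) - 1)]

# Hypothetical test signal: +/-0.1 texture of period 2 plus a unit step at n = 6.
texture = [0.1 if n % 2 == 0 else -0.1 for n in range(12)]
signal = [(1.0 if n >= 6 else 0.0) + texture[n] for n in range(12)]

d1 = fnd(signal)
d2 = snd(signal)

# Eq. (9): low-pass filtering the FND gives the SND.
print(all(abs(a - b) < 1e-12 for a, b in zip(smooth_pairs(d1), d2)))
print("FND:", [round(v, 2) for v in d1])
print("SND:", [round(v, 2) for v in d2])
```

For this signal the FND alternates between about −0.2 and 0.2 on the texture, while the SND is zero there and responds only on the two samples that straddle the edge. A period-2 oscillation is the most favourable case, because h = [1 1]/2 has its null at the highest frequency; slower textures are attenuated rather than removed.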

3.2.2 Characteristics of the SND operator

In this section, we provide a justification why the SND operator is better than the FND in terms of texture removal. As mentioned in the previous section, this is because the SND operator can be considered as a low-pass filtered version of the FND operator. Therefore, the SND operator produces an output with less high-frequency oscillatory components which are due to the texture. By replacing the FND with the SND in the anisotropic diffusion algorithm, as we will show later, we observe an improvement in texture smoothing performance.

We demonstrate this property of the SND operator with an experiment on a 1-D signal that contains different structures, chosen to highlight the difference between the FND and the SND. Three regions are marked in Fig. 3: a texture region (red band), a strong structure edge (grey band), and a texture+shading region (green band). For the part of the signal with only texture information (red band), the FND operator produces an output with a high degree of fluctuation, while the SND operator is much less sensitive to the texture. For the part of the signal with a prominent structure (grey band), the FND operator produces an output with oscillations near the edge, while the SND operator produces fewer oscillations. Such oscillations lead to jagged edges in texture smoothing. For the part of the signal with texture+shading (green band), the FND operator clearly fails to suppress the oscillations in the signal, whereas the SND operator smooths them out better. In summary, the SND operator is more robust than the FND operator in smoothing out textures.

Fig. 3 A comparison of the FND and the SND for filtering a 1-D signal with variable structures (highlighted in red, grey and green) and their corresponding smoothing results

To further highlight the difference between using the two operators in texture smoothing, we conducted an experiment using the proposed algorithm (which is detailed in Section 4) to remove the texture in an image. Results are shown in Fig. 4. It can be clearly observed that the performance of the SND operator is better than the FND operator in terms of texture removal while maintaining significant objects’ boundaries.

Fig. 4 A comparison of the effect of the FND and the SND operators. a Input image. b Pre-smoothed image and its gradient map. c Extracted structure and its gradient map using anisotropic diffusion with the FND operator. d Extracted structure and its gradient map using the proposed method with the SND operator

3.2.3 The 2-D SND operator

To extend 1-D SND into 2-D, we use the 2-D coordinates of the pixel. The 2-D SND operator is defined as

$$ \nabla^{(2)}_{d} I(x,y) =\frac{I(x+i,y+j)-I(x,y)}{2} $$
(10)

where i ∈ {−2, 0, 2} and j ∈ {−2, 0, 2}, with (i, j) ≠ (0, 0). For simplicity, we present only two examples of the 2-D SND operator, for the directions 0° and 45°, which are given as

$$ \nabla^{(2)}_{1} I(x,y) =\frac{I(x,y + 2)-I(x,y)}{2} $$
(11)

and

$$ \nabla^{(2)}_{2} I(x,y) =\frac{I(x-2,y + 2)-I(x,y)}{2} $$
(12)

Their corresponding filter masks are shown in Fig. 2c and d.
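A possible NumPy sketch of the directional operator in (10) is given below. It assumes that x indexes rows and y indexes columns, which is consistent with (11) and (12). The offsets for d = 3, ..., 8 are extrapolated by symmetry from the two directions given above, and the replicated border is also an assumption, since no boundary rule is specified in the text. The name `directional_snd` is hypothetical.

```python
import numpy as np

# Unit (row, column) offsets for d = 1..8, i.e. angles (d - 1) * 45 degrees.
# d = 1 and d = 2 follow Eqs. (11)-(12); the other six are assumed by symmetry.
UNIT_OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1),
                (0, -1), (1, -1), (1, 0), (1, 1)]

def directional_snd(image, d):
    """Return (I(x+i, y+j) - I(x, y)) / 2, where (i, j) is twice the unit
    offset of direction d. Borders are replicated (an assumption)."""
    di, dj = UNIT_OFFSETS[d - 1]
    i, j = 2 * di, 2 * dj
    rows, cols = image.shape
    padded = np.pad(image, 2, mode="edge")
    # Pixel (x, y) sits at (x + 2, y + 2) in the padded array.
    neighbour = padded[2 + i:2 + i + rows, 2 + j:2 + j + cols]
    return (neighbour - image) / 2.0

# Horizontal ramp: intensity equals the column index.
ramp = np.tile(np.arange(6.0), (5, 1))
print(directional_snd(ramp, 1))  # slope 1 except near the right border
print(directional_snd(ramp, 3))  # zero: no change along the vertical direction
```

On the ramp, the response in direction d = 1 equals the slope of the ramp away from the border, and the vertical response is zero, as expected from (10).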

4 The proposed algorithm

4.1 Two key steps of the algorithm

There are two key steps in the proposed algorithm. An image is first pre-smoothed by a simple filter. The result is then further smoothed by the anisotropic diffusion algorithm, in which the derivative operations are implemented using the SND operator. More specifically, referring to (1), the directional derivative operator \(\nabla_{d}\) is replaced by the SND operator \(\nabla _{d}^{(2)}\). In the following, we provide a detailed discussion of the pre-smoothing filter, the stopping criterion for the anisotropic diffusion, and the parameter selection for the stopping function.
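The two steps can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' MATLAB implementation: the median filter is used for pre-smoothing (the choice motivated in Section 4.2), the exponential stopping function (4) is used, borders are replicated, intensities are assumed to lie in [0, 1], and the radius of 1 and all function names are hypothetical. The values σ = 0.04, λ = 0.25 and 19 iterations are those reported for the proposed filter in the caption of Fig. 12.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Unit (row, column) offsets of the eight directions; the ordering is an assumption.
UNIT_OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1),
                (0, -1), (1, -1), (1, 0), (1, 1)]

def median_presmooth(image, radius=1):
    """Step 1: median filter over a (2*radius+1)^2 window, replicated borders."""
    size = 2 * radius + 1
    padded = np.pad(image, radius, mode="edge")
    windows = sliding_window_view(padded, (size, size))
    return np.median(windows, axis=(-2, -1))

def snd_diffusion(image, sigma=0.04, lam=0.25, n_iter=19):
    """Step 2: Eq. (1) with every directional derivative replaced by the SND."""
    out = np.asarray(image, dtype=float).copy()
    rows, cols = out.shape
    for _ in range(n_iter):
        padded = np.pad(out, 2, mode="edge")
        update = np.zeros_like(out)
        for di, dj in UNIT_OFFSETS:
            i, j = 2 * di, 2 * dj
            neighbour = padded[2 + i:2 + i + rows, 2 + j:2 + j + cols]
            grad = (neighbour - out) / 2.0                 # SND, Eq. (10)
            update += np.exp(-(grad / sigma) ** 2) * grad  # stopping function, Eq. (4)
        out = out + lam * update
    return out

def extract_structure(image, radius=1, sigma=0.04, lam=0.25, n_iter=19):
    """Pre-smooth, then diffuse."""
    return snd_diffusion(median_presmooth(image, radius), sigma, lam, n_iter)

# Synthetic demo: a vertical step edge plus uniform noise standing in for texture.
rng = np.random.default_rng(0)
clean = np.full((32, 32), 0.2)
clean[:, 16:] = 0.8
noisy = clean + rng.uniform(-0.05, 0.05, size=clean.shape)
result = extract_structure(noisy)

print("texture std before:", noisy[:, :12].std(), "after:", result[:, :12].std())
print("edge contrast after:", result[:, 20:].mean() - result[:, :12].mean())
```

One property of this sketch is worth noting: because every SND offset is a multiple of two pixels, the diffusion step only exchanges information between pixels whose row and column indices have the same parity. The pre-smoothing step is what couples immediate neighbours.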

4.2 Pre-smoothing filter selection

In the proposed approach, a pre-processing step is used to help texture elimination in the second step. After pre-smoothing, homogeneous texture regions should become relatively flat (similar values), while salient edges should be preserved. It is clearly seen in Fig. 4b that applying a pre-smoothing filter alone is not sufficient to remove all texture information; a diffusion process is needed to further suppress the texture. The property of the Gaussian kernel can be used for structure identification in images [25]: after Gaussian filtering, any structure with a scale smaller than the scale parameter of the Gaussian kernel disappears. Motivated by this property, we adopt a similar idea to alleviate the effect of high-frequency textures.

An interesting question is: what would be the best filter for obtaining the pre-smoothed image? To answer this question, we conducted experiments by employing various simple smoothing filters: median filter, Gaussian low-pass filter, and mean filter.

We artificially create an image that contains different structures and employ it as a reference image. We then add uniformly distributed noise to simulate the high-frequency oscillatory components. The resulting noisy image I is processed by each of the three filters mentioned above. We tuned the parameters of these filters so that each achieves its highest structure similarity index (SSIM) [35] and peak signal-to-noise ratio (PSNR). One row of the reference image, the noisy image, and the filtering results, together with the corresponding SSIM and PSNR, are shown in Fig. 5. We can clearly observe that the median filter achieves the best SSIM and PSNR.
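For reference, the PSNR used in this comparison can be computed as in the sketch below, which uses the standard definition 10 log10(peak² / MSE) for one row of the image. The function name and the default peak value of 1.0 (intensities in [0, 1]) are assumptions; SSIM is omitted because its definition [35] is considerably longer.

```python
import math

def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length sequences."""
    if len(reference) != len(estimate):
        raise ValueError("sequences must have the same length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)

# A constant error of 0.1 on a unit-range signal gives an MSE of 0.01, i.e. about 20 dB.
print(psnr([0.0, 0.0, 0.0, 0.0], [0.1, 0.1, 0.1, 0.1]))
```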

Fig. 5 Results of the proposed method using different filters in the pre-smoothing stage to obtain the texture smoothing response

To study the performance of the proposed filter on images, we add texture components to a texture-free image, which is employed as the ground truth, as illustrated in Fig. 6b. The resulting textured image is processed by the pre-smoothing filters. The radius of the neighbourhood for the median filter and the mean filter is \(r_{m}\), and the Gaussian filter has a scale parameter \(\sigma_{g}\). Each of these pre-smoothing parameters, as well as the scale parameter of the stopping function (σ), is adjusted to produce various smoothing effects, as shown in Figs. 6, 7, and 8; the corresponding SSIM and PSNR are presented in Fig. 9. We can make the following observations. From Figs. 6, 7, and 8, we can see that the results of the proposed method are robust to perturbations in the parameter settings of the pre-smoothing filters and of σ. It is also observed that the PSNR and SSIM results of the proposed filter using the median filter and the mean filter are close to each other. These results are summarized in Table 1. Since the median filter has a proven ability to preserve edges, it is adopted as the pre-smoothing filter.

Fig. 6 Results of the proposed filter using different parameter settings for the median filter as a pre-smoothing step. The neighbourhood size of the median filter is denoted by \(r_{m}\)

Fig. 7 Results of the proposed filter using different parameter settings for the Gaussian filter as a pre-smoothing step. The scale parameter of the Gaussian kernel is denoted by \(\sigma_{g}\)

Fig. 8 Results of the proposed filter using different parameter settings for the mean filter as a pre-smoothing step. The neighbourhood size of the mean filter is denoted by \(r_{m}\)

Fig. 9 The effect on the proposed method of different parameter settings of the pre-smoothing step using the median filter, Gaussian filter, and mean filter. The PSNR and SSIM measures were employed to evaluate the image quality in Figs. 6, 7, and 8. The first row and the second row show the SSIM and the PSNR of the image produced by the proposed algorithm, respectively

Table 1 The best SSIM and PSNR of the images in Figs. 6, 7, and 8

4.3 Stopping criterion and optimal smoothing parameter selection

In the second step, anisotropic diffusion using the directional SND operator is applied to the pre-smoothed image. There are two parameters in this algorithm: the number of iterations and the scale parameter in the stopping function. In this subsection, we study how to set these two parameters.

4.3.1 The number of iterations

It is well known that as the number of iterations k increases, the result of anisotropic diffusion eventually converges into a segmented piecewise constant image. Therefore, the resultant image might not achieve the desirable structure extraction goal. Hence, it is important to identify the number of iterations that provides an optimal trade-off between structure extraction and texture elimination.

In this subsection, we study the stopping criterion leading to a smoothed image which has salient object features similar to the original image. To develop the criterion, we use the cost function of total variation [29] which is defined as:

$$ J(k) = \sum\limits_{n = 1}^{N} |I^{0}(n)-I^{k}(n)|^{2} + \gamma \sum\limits_{n = 1}^{N} |\nabla I^{k}(n)| $$
(13)

where \(I^{0}(n)\) is the original image and \(I^{k}(n)\) is the smoothed image from the proposed filter at pixel n, k is the number of iterations, and \(|\nabla I^{k}(n)|\) is the magnitude of the gradient. The cost function forces the output image to satisfy the following requirements to achieve structure preservation: (1) it must be smooth in textured areas, except at the locations of prominent boundaries, and (2) it must be close to the original image.

To develop a stopping criterion for the proposed anisotropic diffusion step, we choose as the desired output the image at the kth iteration that corresponds to the smallest cost. We conducted experiments by applying the proposed approach to 10 different texture images. The cost function (13) is evaluated for the output of each iteration. The average costs J(k) over the 10 images are shown in Fig. 10. To simplify the experiment, we fix γ = 0.45 and evaluate the cost function for a range of values of the diffusion parameter σ (0.04, 0.06, and 0.08) and the number of iterations k. We can clearly observe that the minimum cost occurs at three different iterations (k = 19, k = 10, and k = 8), corresponding to σ = 0.04, 0.06, and 0.08, respectively. An interesting question is: what is the best number of iterations for texture smoothing? To answer this question, we choose the output image produced at the iteration k associated with the global minimal cost. This image achieves the best trade-off between texture smoothing and preserving significant information of the original image. We can observe that, for the three settings of σ, the cost function is relatively small for k in the range from 10 to 20. Therefore, we can choose k in this range. It is also worth mentioning that the parameter γ is not required by the proposed filter; it is used only in the calculation of the cost function to study the stopping criterion.
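The selection rule described above can be sketched for 1-D signals as follows. The gradient magnitude in (13) is approximated here by a forward difference, which is an assumption, and the three "iterates" are hand-made illustrative signals rather than outputs of the proposed filter. The names `tv_cost` and `best_iteration` are hypothetical.

```python
def tv_cost(original, smoothed, gamma=0.45):
    """Eq. (13) for 1-D signals: squared fidelity plus gamma times total variation.
    |grad| is taken as a forward difference (an assumption)."""
    fidelity = sum((o - s) ** 2 for o, s in zip(original, smoothed))
    variation = sum(abs(smoothed[n + 1] - smoothed[n])
                    for n in range(len(smoothed) - 1))
    return fidelity + gamma * variation

def best_iteration(original, iterates, gamma=0.45):
    """Return (k, J(k)) for the smallest cost; iterates[k - 1] is the output
    after k iterations."""
    costs = [tv_cost(original, s, gamma) for s in iterates]
    k_best = min(range(len(costs)), key=costs.__getitem__) + 1
    return k_best, costs[k_best - 1]

# Textured step edge and three hand-made stand-ins for successive outputs.
original = [0.0, 0.2, 0.0, 0.2, 1.0, 1.2, 1.0, 1.2]
iterates = [
    [0.05, 0.15, 0.05, 0.15, 1.05, 1.15, 1.05, 1.15],  # texture only reduced
    [0.1, 0.1, 0.1, 0.1, 1.1, 1.1, 1.1, 1.1],          # texture removed, edge kept
    [0.6] * 8,                                         # over-smoothed
]
print(best_iteration(original, iterates))
```

In this toy example the second signal has the lowest cost: the first keeps too much variation, and the third pays a large fidelity penalty for flattening the edge. This mirrors the trade-off that the stopping criterion is meant to capture.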

Fig. 10 Cost function J(k) averaged over the 10 test texture images for different settings of the diffusion parameter σ (0.04, 0.06, and 0.08) and the regularization parameter γ = 0.45. For these values of σ, the minimum cost J(k) occurs at iterations 19, 10, and 8, respectively

4.3.2 The scale parameter of the stopping function

The scale parameter σ of the stopping function controls the amount of smoothing. With a large value of σ, both texture and important structural information are filtered out. On the other hand, with a small value of σ, the texture components are preserved. Therefore, σ controls the trade-off between texture smoothing and the preservation of distinctive boundaries. To determine the optimal value, as discussed in the previous section, we evaluate the cost for different values of σ; the value that achieves the minimum cost is optimal. From Fig. 10, we can see that the minimum cost occurs at k = 19, 10, and 8 for σ = 0.04, 0.06, and 0.08, respectively. We choose the setting whose minimum occurs at k = 19, which means that the optimal value is σ = 0.04.

5 Applications and comparisons

All experiments were conducted using MATLAB on a PC with an Intel i7 processor running at 3.2 GHz and 32 GB of RAM.

5.1 Experimental study of the proposed approach

5.1.1 Effects of parameters settings

In Fig. 11, we present results of the proposed filter under different settings of its two parameters: the scale parameter (σ) of the non-linear stopping function, which adjusts the amount of diffusion, and the size (\(r_{m}\)) of the neighbourhood of the median filter that yields the pre-smoothed image. We can see from Fig. 11 that when σ is relatively large (corresponding to higher diffusion strength), major edges such as the face, as well as small textures (the sand in the background), are blurred, while a smaller value of \(r_{m}\) preserves small details such as the tiles and dots in the face.

Fig. 11 Effect of different values of σ and \(r_{m}\) on the results of the proposed filter for a texture image

5.2 Applications

In this section, we present smoothing results and various applications of the proposed filter and compare them with state-of-the-art algorithms.

5.2.1 Structure extraction

We demonstrate that the proposed filter is an effective tool for smoothing out texture information while preserving large objects and sharp edges. We compare the performance of our filter with a group of state-of-the-art texture smoothing algorithms: relative total variation (RTV) [38], rolling guidance filter (RGF) [40], bilateral texture filter (BTF) [7], spanning tree filter (STF) [5], static and dynamic filter (SDF) [12], interval gradient filter (IGF) [23], and scale-aware filter (SAF) [14].

5.2.2 Running speed

Table 2 compares the average running time over 10 runs of our method with that of a group of state-of-the-art texture smoothing algorithms. The results show that the proposed filter is much faster than the spanning tree filter (STF) [5], rolling guidance filter (RGF) [40], relative total variation (RTV) [38], bilateral texture filter (BTF) [7], static and dynamic filter (SDF) [12], scale-aware filter (SAF) [14], and interval gradient filter (IGF) [23].

Table 2 Running time (seconds) of texture smoothing experiments in Figs. 4 and 12. Results of the proposed filter are listed in the last column

Results are shown in Figs. 12 and 13. It can be clearly seen from the close-up view in Fig. 13e that STF fails to smooth out the texture around distinctive boundaries. On the other hand, while SDF and RTV tend to eliminate oscillatory textures, they also over-smooth small details, such as those in the dorsal and pectoral fins of the fish, as shown in Fig. 13b and f. We can also observe that the structure extraction results of RGF, BTF, IGF, SAF, and our filter are similar to each other, except that our results preserve the small strips in the dorsal and pectoral fins well. In terms of overall visual appearance, the contrast and colour of the image produced by the proposed filter appear more natural and closer to the original image.

Fig. 12 A comparison of texture smoothing results for the “Fish” image with state-of-the-art algorithms. a Original image, b RTV (σ = 6, λ = 0.02, 𝜖 = 0.02, Niter = 4), c RGF (σr = 0.08, σs = 5, Niter = 5), d BTF (k = 5, Niter = 5), e STF (σs = 5, σr = 0.05, σ = 0.02, Niter = 5), f SDF (σ = 2, μ = 50, ν = 400, λ = 200, steps = 1), g IGF (σ = 4.3, 𝜖 = 0.032), h SAF (σ = 4, σr = 0.1, Niter = 5), i Our filter (σ = 0.04, λ = 0.25, Niter = 19)

Fig. 13 Texture smoothing results for the dorsal and pectoral fins of the fish

We also provide a quantitative evaluation of the proposed algorithm. The evaluation is conducted by adding a texture to a texture-free image, which is used as the ground truth, as shown in Fig. 14a and b. We tune the parameters of each filter so that its output retains the meaningful structure and removes as much of the texture as possible. The structure similarity index (SSIM) [35] and the peak signal-to-noise ratio (PSNR) are adopted to quantify the texture smoothing performance. The subjective evaluation of the proposed algorithm is confirmed by the largest SSIM and PSNR, which are obtained by the proposed method. Table 3 shows that the SSIM and PSNR values of the image produced by our algorithm are higher than those of RTV, RGF, BTF, STF, SDF, IGF, and SAF.

Fig. 14 A comparison of texture smoothing results for the “Flautist” image with a group of state-of-the-art algorithms. a Ground truth (GT), b Texture+ground truth, c RTV (σ = 2.5, λ = 0.015, 𝜖 = 0.02, Niter = 4), d RGF (σr = 0.05, σs = 3.9, Niter = 5), e BTF (k = 3, Niter = 15), f STF (σs = 1, σr = 0.05, σ = 0.002, Niter = 9), g SDF (σ = 2, μ = 200, ν = 500, λ = 200, steps = 5), h IGF (σ = 2, 𝜖 = 0.032), i SAF (σ = 2.4, σr = 0.07, Niter = 5), j Our filter (σ = 0.04, λ = 0.25, Niter = 20)

Table 3 The objective evaluation of the texture smoothing results using SSIM and PSNR for images in Fig. 14. The SSIM and PSNR of the synthesized image (shown in Fig. 14b) are 0.5330 and 21.244 (dB), respectively. Results of the proposed approach are listed in the last column

5.2.3 Texture transfer

In this application, we use the proposed filter for texture transfer, which is discussed in [30]. The key idea is to extract desirable textures from a reference image and embed them into a target image. An example is shown in Fig. 15, in which we transfer cracks from the Mona Lisa painting (the red rectangle in Fig. 15a) to the “An Elegant Beauty” painting (Fig. 15c). We first select a region in the Mona Lisa image that contains suitable textures. We then apply the proposed filter to extract the texture layer from this region. Finally, the extracted textures are added to the target image. In order to preserve the colour of the target image, the texture transfer is performed on the luminance component of the YUV colour space. As a result, the rendered image has a natural look, with a historical feel conveyed by the cracks.

Fig. 15 Texture in image a is extracted and presented in b. The texture is blended into the target image c. The result is shown in d

5.2.4 Contrast enhancement

Contrast enhancement is a commonly used tool in image processing. Most contrast enhancement algorithms assume that the input image is free of noise and compression artifacts. However, in practice, images are often compressed, for example when they are streamed over the internet or captured by cameras. An example is presented in Fig. 16a. Applying a contrast enhancement algorithm directly to such an image not only improves its appearance but also unintentionally amplifies noise and artifacts, especially JPEG compression artifacts, as shown in Fig. 16c and g. To overcome this problem, Yu et al. [24] proposed an approach that avoids boosting JPEG artifacts while improving image appearance. They adopted a texture-structure decomposition method to separate the original image into a structure layer and a texture layer, which are processed separately. The structure layer is enhanced by tone curve adjustment, and the texture layer is processed carefully to remove the compression artifacts. The final output is obtained by recombining the two layers. Since the proposed filter can extract image structure while smoothing out texture and noise, it can also be used for contrast enhancement as in [24]. We compare the performance of the proposed filter with the contrast enhancement approach in [24]. The results in Fig. 16e and i demonstrate that the performance of the two approaches is similar, except that our result is sharper, especially around the clouds in the sky.

Fig. 16

Image contrast enhancement using tone curve adjustment and texture smoothing. The structure images in (b) and (f) are extracted from the original image in (a) by the method in [24] and by the proposed filter, respectively. The results of directly boosting structure and texture together, corresponding to (b) and (f), are presented in (c) and (g). The results of boosting only the structure in (b) and (f) are shown in (d) and (h). The enhanced results, which combine the boosted structure with the processed texture, are shown in (e) for the method in [24] and in (i) for the proposed approach
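
The decompose, enhance, and recombine pipeline described above can be sketched on a one-dimensional signal. This is only an illustration under stated assumptions: the gamma curve and the soft threshold are placeholders for the tone curve adjustment and the artifact removal, whose exact forms in [24] are not given here, and the structure layer is assumed to be provided by a structure-extraction filter. All function names are hypothetical.

```python
def tone_curve(s, gamma=0.7):
    """Placeholder tone curve: a gamma curve on values clamped to [0, 1]."""
    return min(1.0, max(0.0, s)) ** gamma


def soft_threshold(t, tau=0.02):
    """Shrink texture values toward zero; amplitudes below tau are removed."""
    if t > tau:
        return t - tau
    if t < -tau:
        return t + tau
    return 0.0


def enhance(image, structure, gamma=0.7, tau=0.02):
    """Boost the structure layer, clean the texture layer, then recombine."""
    out = []
    for i, s in zip(image, structure):
        t = i - s  # texture layer = image minus structure
        v = tone_curve(s, gamma) + soft_threshold(t, tau)
        out.append(min(1.0, max(0.0, v)))
    return out


# Toy data: the second sample carries a small artifact-like ripple (0.01),
# the last sample carries a larger, genuine detail (0.10).
image = [0.25, 0.26, 0.50, 0.60]
structure = [0.25, 0.25, 0.50, 0.50]  # stand-in for the filter output
enhanced = enhance(image, structure, gamma=0.5, tau=0.02)
```

In this toy case the small ripple is removed before recombination, so it is not amplified by the tone curve, while the larger detail survives with a reduced amplitude.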

5.2.5 Superpixel segmentation

In recent years, superpixel segmentation has become an increasingly active research area in image processing and computer vision. In [1], Achanta et al. introduced an approach that adopts k-means clustering to produce superpixel segmentations efficiently. This approach is called simple linear iterative clustering (SLIC). Although SLIC yields uniform and compact superpixels, it does not handle textured images well because of their high-frequency oscillatory components. Applying SLIC directly to a textured image leads to segments with irregular boundaries, as shown in Fig. 17b. Since the proposed filter performs texture-structure decomposition, we can use it to improve superpixel segmentation by mitigating the effect of highly oscillatory textures, which helps to produce better segmented regions. In Fig. 17, we compare our superpixel segmentation with the state-of-the-art algorithm [1]. Our method produces uniform segmented areas. Owing to the structure extraction properties of our filter, the boundaries of each segment are more regular than those in the SLIC results.

Fig. 17

Results of superpixel segmentation. First column: original images. Second column: results of simple linear iterative clustering (SLIC). Third column: results of our method
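
The effect of texture on the clustering can be illustrated with a one-dimensional toy example. The sketch below is not SLIC itself and not the proposed filter: it runs plain k-means on (intensity, weighted position), which is the idea SLIC builds on, and it uses a 3-tap binomial filter as a stand-in for texture removal. The signal, the spatial weight, and all function names are assumptions chosen for illustration.

```python
def binomial_smooth(signal):
    """3-tap [1, 2, 1] / 4 smoothing with replicated borders (stand-in filter)."""
    n = len(signal)
    return [0.25 * signal[max(i - 1, 0)] + 0.5 * signal[i]
            + 0.25 * signal[min(i + 1, n - 1)] for i in range(n)]


def cluster_1d(signal, k=2, spatial_weight=0.02, iterations=5):
    """k-means on (intensity, weighted position), seeded on a regular grid."""
    n = len(signal)
    seeds = [(2 * j + 1) * n // (2 * k) for j in range(k)]
    centers = [(signal[p], float(p)) for p in seeds]
    labels = [0] * n
    for _ in range(iterations):
        # Assignment step: nearest centre in the joint intensity-position space.
        for i, value in enumerate(signal):
            dists = [(value - c) ** 2 + (spatial_weight * (i - p)) ** 2
                     for c, p in centers]
            labels[i] = dists.index(min(dists))
        # Update step: recompute each centre from its members.
        for j in range(k):
            members = [i for i in range(n) if labels[i] == j]
            if members:  # keep the old centre if a cluster becomes empty
                centers[j] = (sum(signal[i] for i in members) / len(members),
                              sum(members) / len(members))
    return labels


def boundary_count(labels):
    """Number of label changes between neighbouring samples."""
    return sum(1 for a, b in zip(labels, labels[1:]) if a != b)


# Two regions (means 0.3 and 0.7) covered by a texture of amplitude 0.6.
textured = [0.0, 0.6] * 4 + [0.4, 1.0] * 4
raw_labels = cluster_1d(textured)
smooth_labels = cluster_1d(binomial_smooth(textured))
```

On the raw signal the texture amplitude exceeds the contrast between the two regions, so the labels switch repeatedly inside the left region. After smoothing, the labels change only once, at the region boundary.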

5.2.6 Image pencil sketch

A pencil sketch is a non-photorealistic rendering of a real image. This effect can be generated by adding gradients or edges to a smoothed version of the input image to produce a different visual impression, such as a pencil-like or cartoon-like image. Lu et al. [26] generate pencil drawing effects by manipulating image gradients. For a textured image, applying the pencil sketch directly cannot produce visually pleasing results because the textures are highly correlated with the objects in the scene. To produce a better result, texture filtering can be applied to the original image as a pre-processing step, and the filtered image is then used to obtain the pencil drawing. This leads to a more abstract image that shows the object of interest more clearly. Figure 18 presents pencil sketching results using our filter and the algorithm in [26]. It can be observed that the result obtained with our filter is a clearer and more abstract pencil drawing than that of the method in [26].

Fig. 18

Pencil sketching results
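
As a rough illustration of drawing edges from a pre-filtered image, the sketch below darkens a white canvas in proportion to the gradient magnitude of the smoothed input. It is a simplified stand-in, not the method of Lu et al. [26]: the forward-difference gradient, the `strength` parameter, and the function names are assumptions, and the input is assumed to be an already texture-filtered greyscale image with values in [0, 1].

```python
def gradient_magnitude(img):
    """Forward-difference gradient magnitude with replicated borders."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = img[y][min(x + 1, w - 1)] - img[y][x]
            gy = img[min(y + 1, h - 1)][x] - img[y][x]
            mag[y][x] = (gx * gx + gy * gy) ** 0.5
    return mag


def pencil_sketch(smoothed, strength=2.0):
    """White canvas (1.0) darkened where the smoothed image has strong edges."""
    return [[max(0.0, 1.0 - strength * m) for m in row]
            for row in gradient_magnitude(smoothed)]


# Toy input: a texture-free image with one vertical step edge.
smoothed = [[0.2, 0.2, 0.8, 0.8] for _ in range(3)]
sketch = pencil_sketch(smoothed)
```

If the input still contained texture, every oscillation would produce a stroke, which is why texture filtering is applied before the sketch is generated.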

6 Conclusion

In this paper, a new technique for structure extraction smoothing is introduced. The proposed filter has two main steps. A pre-smoothing filter is first applied to the input image, and the result is then processed by a new anisotropic diffusion algorithm that uses the second neighbour derivative (SND) operator. The key difference between classical anisotropic diffusion and the proposed filter is that the first neighbour derivative (FND) operator is replaced by the SND operator. We have compared the responses of the FND and the SND in image areas containing texture and edges. We observed that the SND is less sensitive to texture than the FND, while it performs equally well in preserving edge information. These observations provide strong evidence to justify the use of the SND. We have also studied three key issues related to the proposed algorithm: the selection of the pre-smoothing filter, the number of iterations, and the scale parameter that controls the smoothing effect. While we have demonstrated the performance of the proposed technique in a wide range of applications and compared it with state-of-the-art methods, the technique has further applications in other computer vision tasks such as visual tracking. We will consider this in our future work (see Footnote 1). We have also shown that the proposed technique is the fastest of the compared methods and that its performance is among the best.
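
The contrast between the two operators can be illustrated on one-dimensional toy signals. The sketch below assumes that the FND is the difference between a sample and its immediate neighbour and that the SND is the difference between a sample and its second neighbour. This is an assumed form for illustration only; the definition given earlier in the paper takes precedence. The signals and function names are hypothetical, and the texture is deliberately chosen with a period of two samples so that the effect is easy to see.

```python
def fnd(signal):
    """First neighbour difference: x[i + 1] - x[i]."""
    return [b - a for a, b in zip(signal, signal[1:])]


def snd(signal):
    """Second neighbour difference (assumed form): x[i + 2] - x[i]."""
    return [b - a for a, b in zip(signal, signal[2:])]


def peak(values):
    """Largest absolute response."""
    return max(abs(v) for v in values)


texture = [0.0, 0.2] * 4       # fine oscillation with a period of two samples
edge = [0.0] * 4 + [1.0] * 4   # a single step edge

print("texture: FND", peak(fnd(texture)), "SND", peak(snd(texture)))
print("edge:    FND", peak(fnd(edge)), "SND", peak(snd(edge)))
```

In this idealised case the second neighbour difference does not respond to the oscillation while both operators respond equally to the step edge. Real textures are not strictly periodic, so in practice the SND response to texture is reduced rather than eliminated.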