Abstract
With the rapid growth in the volume of multimedia data, security has become an important yet challenging problem in multimedia applications. Images account for the largest share of multimedia data and are therefore central to multimedia security. Image segmentation, which breaks a given image into multiple salient regions, is a fundamental preprocessing step in many multimedia applications such as security surveillance. In this paper, we present a new image segmentation approach based on frequency-domain filtering for images with stripe texture, and generalize it to lattice fence images. Our method significantly reduces the impact of stripes on segmentation performance. The proposed approach consists of three phases. First, we automatically weaken the stripe texture by filtering in the frequency domain. Then, structure-preserving image smoothing is employed to remove texture details and extract the main image structures. Finally, we use an effective threshold method to produce the segmentation results. Our method achieves very promising results on the test image dataset and could benefit a number of multimedia applications such as public security.
1 Introduction
With recent progress in computing technologies, the volume of high-resolution, high-quality multimedia data such as images, video clips, animations, graphics, and audio has been growing exponentially over the past several years. Various multimedia modeling methods have been proposed to process multimedia data in different areas such as image categorization [25, 31, 32]. In [28], the authors propose a feature selection algorithm that filters out low-efficiency features for fast speech emotion recognition. Wang et al. propose a collaborative sparse coding framework that optimizes the classifiers and dictionary collaboratively for action recognition [20]. Zhang et al. develop a semantically aware photo retargeting method that shrinks a photo according to region semantics, with a mechanism that transfers the semantics of noisy image labels into different image regions [26]. Weakly supervised fixation prediction, which leverages image labels to improve the accuracy of human fixation prediction, is proposed in [27] and can facilitate many multimedia applications, e.g., image retrieval, action recognition, and photo retargeting.
Image segmentation is a fundamental and widely studied problem that plays an important role in various multimedia applications, such as image annotation [15] and retrieval [6], object recognition [1] and matching [8], scene analysis [10], visual tracking [11], social media mining [18], and security screening [7]. It is therefore also of great benefit to multimedia security.
Motivated by different goals, various image segmentation algorithms have been developed and have achieved very promising performance. Vese and Chan proposed a multiphase level set framework for image segmentation using the Mumford and Shah model, for piecewise constant and piecewise smooth optimal approximations [19]. Comaniciu and Meer's Mean Shift seeks the modes of a non-parametric probability distribution in a feature space; it respects object details well, though it tends to split an object into pieces [5]. Shi and Malik's Normalized Cuts (Ncut) treats image segmentation as a graph partitioning problem and, to segment the graph, proposes the normalized cut criterion, which measures both the total dissimilarity between the different groups and the total similarity within the groups [17]. Boykov and Jolly proposed an algorithm for general-purpose interactive segmentation of N-dimensional images [3]. In [30], graphlets are introduced to represent a photo's aesthetic features, and a probabilistic model is proposed to transfer aesthetic features from the training photo onto the cropped photo. Zhang et al. present a weakly supervised image segmentation algorithm that learns the distribution of spatially structured superpixel sets from image-level labels [29]. A weakly supervised image segmentation model that focuses on learning the semantic associations between superpixel sets is proposed in [33].
However, for images with stripe texture, as shown in Fig. 1, or with a lattice fence pattern, such as the security images shown in Fig. 7, existing methods cannot achieve satisfactory results because the stripe texture is more salient than the object edges, which makes it difficult for these methods to differentiate the main structure from texture details.
To address striped-texture image segmentation, we propose a novel framework consisting of three steps: (1) using a frequency-based band rejection filter to weaken stripes, since stripes in the spatial domain correspond to an elongated pattern in the frequency domain [13]; (2) taking advantage of structure-preserving image smoothing [22] to remove noise and highlight the main structures; (3) using an effective threshold method to classify pixels as foreground or background. As demonstrated in Fig. 1, the proposed method achieves a high-precision segmentation result. High-quality image segmentation greatly improves the results of many applications. We present several fields that our method could benefit in the experimental results.
Our main contributions are:
- a framework for segmenting images with a periodic stripe or lattice fence pattern, which benefits multimedia applications;
- the use of a structure-preserving image smoothing technique for image segmentation.
2 Related work
As far as we know, this is the first work to combine frequency-based image de-striping and structure-preserving image smoothing for striped-texture image segmentation. De-striping and structure-preserving image smoothing are basic tools for remote sensing imagery (RSI) processing and image enhancement, respectively. Both have been addressed by a variety of works.
2.1 Image de-striping
Stripe noise in remote sensing images not only sharply degrades the visual quality of the image, but also jeopardizes its suitability for subsequent processing [2]. Various de-striping methods have been proposed to remove the stripes and improve image quality. From a methodological viewpoint, existing stripe removal methods can be classified into three categories. The first category, filtering-based methods, is widely used. Pande-Chhetri et al. develop an image de-striping method based on wavelet analysis with fast and impressive de-striping results [13]. Münch et al. combine wavelet and Fourier analysis to remove horizontal or vertical stripes [12]. Chen et al. propose to use a power finite-impulse response filter to remove the striping-induced frequency components [4].
The main idea of the second category is to rectify the distribution of the stripes to a reference distribution. Wegener proposes to calculate the histograms of the stripe lines and then match them to the reference [21]. Rakwatin et al. combine histogram matching with a facet filter to reduce stripe noise in Moderate Resolution Imaging Spectroradiometer (MODIS) data [14].
The third category, variational de-striping methods, regards stripe removal as an ill-posed inverse problem. Shen et al. propose the Huber-Markov variational model to remove the stripes with spatially local adaptive edge-preserving ability [16]. Chang et al. treat the image and stripe components equally, and convert the image de-striping task into an image decomposition problem [23].
These works are either time-consuming or non-automatic. In this paper, we propose a filtering-based method that is automatic, effective, and able to handle stripes in any direction.
2.2 Structure-preserving image smoothing
Structure-preserving image smoothing aims to extract the main structures of images while removing texture details. There are two types of methods.
The first type consists of optimization-based filters. Xu et al. develop a robust method based on new local variation measures to separate structure from texture [22]. Karacan et al. use region covariance matrices to capture local structure and texture information [9].
The second type is weighted-average-based smoothing, which smooths an input image by weighted averaging based on affinities between neighboring pixel pairs. Zhang et al. introduce a scale-aware filter called the rolling guidance filter (RGF) [34]. RGF iteratively performs Gaussian filtering to remove small structures and joint bilateral filtering to recover edges. Zhang et al. design a new edge-aware structure, named the segment graph, to represent the image and further develop a novel double weighted average image filter (SGF) based on the segment graph [24].
However, the methods mentioned above cannot handle images with stripes or a lattice fence pattern because they treat stripes and lattice fences as structures instead of texture.
Our approach combining de-striping with structure-preserving image smoothing achieves promising results on the test image dataset.
3 Approach
Our method has three distinct but interrelated stages: (1) weakening stripes by frequency-based filtering, (2) removing texture details and extracting the main structures, and (3) producing the segmentation results. Figure 2 demonstrates the procedure.
3.1 Frequency based filtering
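The Fourier transform is an important tool for image processing. For an image I(x,y) of size M×N, its 2D discrete Fourier transform F(u,v) is defined as
$$F(u,v) = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} I(x,y)\, e^{-j 2\pi \left( \frac{ux}{M} + \frac{vy}{N} \right)} \tag{1}$$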
According to [13], the striping patterns in the original image will be captured in the frequency domain as an elongated pattern in the direction perpendicular to the stripes. For example, horizontal stripes (Fig. 1a) are presented as a vertical central narrow band in Fourier domain (and contrarily, vertical ones as horizontal narrow band). Therefore, to weaken stripe texture, we apply a band rejection filter on the image spectrum.
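The correspondence between stripe direction and band direction can be checked numerically. The following is a minimal sketch on a synthetic image, assuming NumPy and a centered (fftshift) spectrum layout; the helper name `power_spectrum` and the test image are illustrative and not part of the original method.

```python
import numpy as np

def power_spectrum(image):
    """Centered power spectrum |F(u, v)|^2 of a 2D grayscale image."""
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    return np.abs(spectrum) ** 2

# Synthetic 64x64 image with horizontal stripes of period 8 pixels:
# the intensity changes from row to row and is constant along each row.
rows = np.arange(64)
stripes = np.tile(np.sin(2 * np.pi * rows / 8.0)[:, None], (1, 64))

power = power_spectrum(stripes)
center_col = power.shape[1] // 2

# Fraction of the spectral energy lying on the central vertical line.
energy_on_vertical_band = power[:, center_col].sum() / power.sum()
print(energy_on_vertical_band)
```

For this idealized input essentially all of the energy sits on the central column, at the two rows that correspond to the stripe frequency. Real images spread the energy into a narrow band rather than a single column.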
The filtering process in the frequency domain is then expressed as
$$M(u,v) = H(u,v)\, F(u,v) \tag{2}$$
in which H(u,v) denotes the frequency filter, F(u,v) is the Fourier power spectrum of the original image, and M(u,v) is the filtered Fourier power spectrum.
The de-striping method consists of three steps as shown in Fig. 3.
1. Compute the power spectrum of the original image by 2D FFT.
2. Apply a band rejection filter to the spectrum to mask the frequency components that cause stripes.
3. Apply the inverse 2D FFT to the masked spectrum.
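The three steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses NumPy, an ideal (0/1) rejection filter, and a hand-made mask. The function name `destripe` and the synthetic image are hypothetical. The sketch keeps the complex spectrum rather than only its magnitude, because the inverse transform needs the phase.

```python
import numpy as np

def destripe(image, reject_mask):
    """Weaken stripes by zeroing the masked frequency components.

    image       : 2D float array (grayscale).
    reject_mask : 2D bool array in centered (fftshift) layout, True where
                  a frequency component is rejected, i.e. H(u, v) = 0.
    """
    # Step 1: centered 2D spectrum F(u, v).
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    # Step 2: ideal band rejection, M(u, v) = H(u, v) * F(u, v).
    H = np.where(reject_mask, 0.0, 1.0)
    filtered = H * spectrum
    # Step 3: inverse 2D FFT. The imaginary part is only round-off as long
    # as the mask is symmetric about the spectrum centre.
    return np.real(np.fft.ifft2(np.fft.ifftshift(filtered)))

# Synthetic test image: a horizontal intensity ramp plus horizontal stripes.
rows = np.arange(64)
stripes = np.tile(np.sin(2 * np.pi * rows / 8.0)[:, None], (1, 64))
ramp = np.tile(np.linspace(0.0, 1.0, 64)[None, :], (64, 1))
image = ramp + 0.5 * stripes

mask = np.zeros((64, 64), dtype=bool)
mask[:, 32] = True        # central vertical band carries horizontal stripes
mask[30:35, 32] = False   # keep the low frequencies near the centre
restored = destripe(image, mask)
print(np.abs(restored - ramp).max())
```

In this idealized case the stripes are removed and the ramp is recovered up to round-off. On real images the band has a finite width and the rejected components also carry some image content, which is why the mask design matters.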
Designing the band rejection filter involves two key components.
(1) Locate the stripe frequency components
We propose an effective method to detect which frequencies should be suppressed by the band rejection filter. Since stripes in the spatial domain are reflected by narrow bands of high amplitude values in a direction orthogonal to the stripes [13], we accumulate the intensity values of the image spectrum along the narrow band direction to get the discrete accumulation curve S. The narrow bands of the most likely stripe frequencies are detected by finding the highest values of S, in particular its local maxima. We call these bands stripe-bands. The peaks show where the rejection filter should be applied.
For the image with horizontal stripes in Fig. 1a, the accumulation curve of its spectrum (Fig. 4b) is shown in Fig. 4a. The red points mark three frequency bands along the vertical axis to be rejected. Our method is not limited to vertical or horizontal stripes.
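The accumulation and peak search can be sketched as follows, assuming NumPy and a centered spectrum. The helper names, the relative height threshold `rel_height`, and the mean subtraction (used here so that the DC term does not create a central peak on its own) are assumptions made for illustration; the paper does not specify these details.

```python
import numpy as np

def accumulation_curve(power, band_axis=0):
    """Sum the centered power spectrum along the band direction.

    band_axis=0 sums down the rows, which suits vertical bands
    (horizontal stripes); use band_axis=1 for horizontal bands.
    """
    return power.sum(axis=band_axis)

def find_peaks(curve, rel_height=0.1):
    """Indices of interior local maxima above rel_height * max(curve)."""
    c = np.asarray(curve, dtype=float)
    interior = np.arange(1, len(c) - 1)
    is_peak = (c[interior] > c[interior - 1]) & (c[interior] >= c[interior + 1])
    is_high = c[interior] >= rel_height * c.max()
    return interior[is_peak & is_high]

# Synthetic image: horizontal ramp plus horizontal stripes of period 8.
rows = np.arange(64)
stripes = np.tile(np.sin(2 * np.pi * rows / 8.0)[:, None], (1, 64))
ramp = np.tile(np.linspace(0.0, 1.0, 64)[None, :], (64, 1))
image = ramp + 0.5 * stripes

# Assumption: remove the mean before the FFT to suppress the DC term.
centered = image - image.mean()
power = np.abs(np.fft.fftshift(np.fft.fft2(centered))) ** 2

S = accumulation_curve(power, band_axis=0)
peaks = find_peaks(S)
print(peaks)
```

Here the only detected peak is the central column, which is the vertical stripe-band produced by the horizontal stripes.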
(2) Design the band rejection filter
Because the low frequencies in the Fourier transform correspond to the smooth areas of an image, we should separate the low frequencies from the stripe frequency band detected above, so as to preserve the image information while weakening the stripes. Following [35], we extend the stripe-band by width W and project the values of the new band onto the vertical line to get another accumulation curve T by (3), in which F(u,v) is the Fourier power spectrum of the original image and W represents the width of the narrow bands of high amplitude values in the spectrum.
By finding the local extreme values of T, we look for two points that separate the low frequencies at the center of the detected frequency band. The region between the two points is then excluded from the rejection band. We treat each band in the same way. In Fig. 4, the region between the two points along the central band is separated from the band.
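One way to realize this step is sketched below for a vertical stripe-band. It rests on several assumptions that the text does not fix: the band is taken as the columns within `half_width` of the detected column, T is the per-row sum over those columns, and the two separating points are the nearest local minima of T on either side of the centre row. All function and parameter names are hypothetical.

```python
import numpy as np

def band_profile(power, band_col, half_width):
    """Curve T: per-row sum of the power over columns band_col +/- half_width."""
    lo = max(band_col - half_width, 0)
    hi = min(band_col + half_width + 1, power.shape[1])
    return power[:, lo:hi].sum(axis=1)

def low_frequency_span(profile, center):
    """Walk outwards from the centre row to the nearest local minimum on each side."""
    top = center
    while top > 0 and profile[top - 1] < profile[top]:
        top -= 1
    bottom = center
    while bottom < len(profile) - 1 and profile[bottom + 1] < profile[bottom]:
        bottom += 1
    return top, bottom

def rejection_mask(shape, band_col, half_width, keep_span):
    """True inside the vertical band, except between the two separating rows."""
    mask = np.zeros(shape, dtype=bool)
    lo = max(band_col - half_width, 0)
    hi = min(band_col + half_width + 1, shape[1])
    mask[:, lo:hi] = True
    mask[keep_span[0]:keep_span[1] + 1, lo:hi] = False
    return mask

# Hand-made profile: a low-frequency lobe at row 5 and stripe peaks at rows 1 and 9.
T = np.array([1.0, 5.0, 2.0, 1.0, 3.0, 9.0, 3.0, 1.0, 2.0, 5.0, 1.0])
span = low_frequency_span(T, center=5)
mask = rejection_mask((11, 7), band_col=3, half_width=1, keep_span=span)
print(span, int(mask.sum()))
```

The rows between the two separating points stay unmasked, so the smooth image content is preserved, while the stripe peaks at rows 1 and 9 remain inside the rejection band.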
We apply the rejection filter to the spectrum to weaken the stripes, and then restore the image by the inverse transform. An example is shown in Fig. 2a. As the enlarged image details show, the stripes are significantly weakened.
A lattice fence image can be treated as the superposition of two sets of stripes in different directions. We apply the method described above to the image in both directions to weaken the lattice fence texture.
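For an axis-aligned lattice, filtering in both directions amounts to rejecting a cross-shaped region of the centered spectrum while keeping a small block of low frequencies. The sketch below assumes NumPy, an ideal 0/1 filter, and hand-chosen values for the hypothetical parameters `half_width` and `keep_radius`; in the method described above these would come from the accumulation curves.

```python
import numpy as np

def cross_rejection_mask(shape, half_width, keep_radius):
    """Reject a vertical and a horizontal band, keeping a central low-frequency block."""
    rows, cols = shape
    cy, cx = rows // 2, cols // 2
    mask = np.zeros(shape, dtype=bool)
    mask[:, cx - half_width:cx + half_width + 1] = True   # horizontal stripes
    mask[cy - half_width:cy + half_width + 1, :] = True   # vertical stripes
    mask[cy - keep_radius:cy + keep_radius + 1,
         cx - keep_radius:cx + keep_radius + 1] = False
    return mask

def apply_mask(image, mask):
    """Zero the masked components of the centered spectrum and invert."""
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    spectrum[mask] = 0.0
    return np.real(np.fft.ifft2(np.fft.ifftshift(spectrum)))

# Synthetic example: smooth content plus a lattice of period 8 in both directions.
n = 64
y, x = np.mgrid[0:n, 0:n]
smooth = np.cos(2 * np.pi * x / n) * np.cos(2 * np.pi * y / n)
lattice = np.sin(2 * np.pi * y / 8) + np.sin(2 * np.pi * x / 8)

mask = cross_rejection_mask((n, n), half_width=0, keep_radius=2)
restored = apply_mask(smooth + lattice, mask)
print(np.abs(restored - smooth).max())
```

In this idealized case the lattice is removed and the smooth content is recovered up to round-off. A fence that is not aligned with the image axes would need bands in the corresponding oblique directions.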
3.2 Image smoothing
By filtering the images in the frequency domain, we suppress the stripe or fence texture. This de-striping process introduces noise, which makes segmentation more challenging. Taking the filtered images as input, we apply a structure-preserving image smoothing method to smooth them and extract their main structures. In this paper, we use the relative total variation (RTV) model [22] to perform image smoothing.
3.2.1 Relative total variation (RTV) model
The relative total variation (RTV) model was proposed to capture the nature of structure and texture, and it achieves promising results [22].
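The RTV model contains a general pixel-wise windowed total variation measure, written as
$$D_x(p) = \sum_{q \in R(p)} g_{p,q} \left| (\partial_x S)_q \right|, \qquad D_y(p) = \sum_{q \in R(p)} g_{p,q} \left| (\partial_y S)_q \right|$$
where $S$ is the output structure image, $q$ belongs to $R(p)$, and $R(p)$ is the rectangular region centered at pixel $p$. $D_x(p)$ and $D_y(p)$ are the windowed total variations in the x and y directions for pixel $p$, which count the absolute spatial differences within the window $R(p)$. $g_{p,q}$ is a weighting function defined according to spatial affinity, expressed as
$$g_{p,q} \propto \exp\left( -\frac{(x_p - x_q)^2 + (y_p - y_q)^2}{2\sigma^2} \right)$$
where σ controls the spatial scale of the window.
To help distinguish prominent structures from texture elements, the RTV model contains, besides D, a windowed inherent variation, expressed as
$$L_x(p) = \left| \sum_{q \in R(p)} g_{p,q} (\partial_x S)_q \right|, \qquad L_y(p) = \left| \sum_{q \in R(p)} g_{p,q} (\partial_y S)_q \right|$$
L captures the overall spatial variation.
The objective function is finally expressed as
$$\arg\min_S \sum_p \left( S_p - I_p \right)^2 + \lambda \left( \frac{D_x(p)}{L_x(p) + \varepsilon} + \frac{D_y(p)}{L_y(p) + \varepsilon} \right)$$
where $I$ is the input image, λ weights the smoothness term, and ε is a small positive number that avoids division by zero.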
Thanks to the proposed relative total variation measure, the RTV model makes main structures of images easily distinguished from detail texture and achieves promising results of structure-preserving image smoothing.
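The intuition behind the relative measure can be illustrated in one dimension. In an oscillating texture the signed gradients inside a window cancel, so the inherent variation L is small while the total variation D is large; at an isolated edge the two coincide. The sketch below assumes NumPy and is only a 1D illustration of the x-direction terms, not an RTV solver; the function names and the default values of `sigma` and `eps` are illustrative choices.

```python
import numpy as np

def gaussian_window(sigma):
    """Normalized Gaussian weights g over a window of radius 3 * sigma."""
    radius = int(3 * sigma)
    offsets = np.arange(-radius, radius + 1)
    weights = np.exp(-offsets ** 2 / (2.0 * sigma ** 2))
    return weights / weights.sum()

def rtv_ratio_1d(signal, sigma=3.0, eps=1e-3):
    """Per-sample ratio D / (L + eps) for a 1D signal."""
    grad = np.diff(signal, append=signal[-1])       # forward difference
    g = gaussian_window(sigma)
    D = np.convolve(np.abs(grad), g, mode="same")   # windowed total variation
    L = np.abs(np.convolve(grad, g, mode="same"))   # windowed inherent variation
    return D / (L + eps)

n = np.arange(64)
texture = (-1.0) ** n              # fine oscillation: gradients cancel in L
edge = (n >= 32).astype(float)     # single step: D and L coincide

texture_ratio = rtv_ratio_1d(texture)
edge_ratio = rtv_ratio_1d(edge)
print(texture_ratio[32], edge_ratio[31])
```

The ratio is large inside the texture and stays at or below one at the edge, so minimizing it penalizes texture far more than structure. This also suggests why salient stripes are a problem when their period is large relative to the window: their gradients no longer cancel within R(p).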
3.2.2 Examples
Figure 5b shows the result of applying the relative total variation model directly to a striped-texture image. The performance is poor because structure-preserving image smoothing implicitly assumes that salient edges come only from object contours or boundaries, which is not true for stripe images. In this model, the parameter σ represents the size of the texture elements. When σ is set large, the stripes are treated as large texture elements, but this results in more blurred images.
Because frequency-domain filtering makes the stripes less obvious, the RTV model treats them as texture details rather than main structures. Comparison results are shown in Fig. 5b and d. The parameters of the two approaches, e.g. σ and λ (smoothness), are set to the same values for each image. In Fig. 5b, the stripes are not smoothed, while they are smoothed in Fig. 5d.
Figure 6 shows the comparison of results of our method and method without lattice fence weakening on a lattice fence pattern image.
3.3 Segmentation
After structure-preserving image smoothing, the smoothed images contain only a few intensity values and the main structures are extracted. In the third step, we employ an effective threshold method for segmentation. The segmentation results of the two approaches are presented in Figs. 5c and e and 6c and e. Our approach achieves superior results.
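The text does not name the threshold method. As an illustration only, the sketch below assumes Otsu's method, a common global threshold that suits images with a few well-separated intensity values; it should be read as a stand-in, not as the method used in the experiments. It relies on NumPy, and the function names are hypothetical.

```python
import numpy as np

def otsu_threshold(image, bins=256):
    """Global threshold that maximizes the between-class variance (Otsu)."""
    hist, edges = np.histogram(image, bins=bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    weights = hist.astype(float) / hist.sum()
    w0 = np.cumsum(weights)              # probability of the lower class
    m = np.cumsum(weights * centers)     # cumulative first moment
    total_mean = m[-1]
    w1 = 1.0 - w0
    valid = (w0 > 0) & (w1 > 0)
    between = np.zeros_like(w0)
    between[valid] = (total_mean * w0[valid] - m[valid]) ** 2 / (w0[valid] * w1[valid])
    # Average over tied maxima so the threshold falls in the middle of a gap.
    best = np.isclose(between, between.max())
    return centers[best].mean()

def segment(image):
    """Boolean foreground mask: pixels brighter than the Otsu threshold."""
    return image > otsu_threshold(image)

# Smoothed image with two intensity levels: dark background, bright square.
smoothed = np.full((32, 32), 0.2)
smoothed[8:24, 8:24] = 0.8
foreground = segment(smoothed)
print(int(foreground.sum()))
```

For this two-level image the mask selects exactly the bright square. Whether the foreground is the brighter or the darker class depends on the image, so the comparison may need to be inverted.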
4 Experimental results
Our method is applied to a variety of test images, including a fence image of a prison and a textile dataset that contains images with stripe texture obtained from a textile mill. Very promising results are achieved.
As shown in Fig. 7, our method segments a prison image occluded by a lattice fence well. Image segmentation plays an important role in security surveillance in places such as train stations and prisons. However, it remains challenging to detect and segment meaningful objects such as humans from a cluttered background, especially when the image suffers from stripe texture noise and occlusion by a lattice fence. The result in Fig. 7 shows that our approach works well in these situations, which is of great value in public security areas such as prison surveillance and crime scene investigation.
Figure 8a shows a textile image whose background is formed by stripe texture. Structure-preserving image smoothing algorithms cannot smooth the stripes directly because the large gradients of the stripes cause them to be mistaken for structures. As we can see, our method separates the flower pattern completely from the stripes in Fig. 8b. Our approach is valuable in the textile industry. Based on the satisfactory segmentation result for the striped jacquard fabric image, textile printing could be combined with jacquard weaving to produce new textile products that are both colorful and high-grade. First, we acquire an image of the jacquard fabric and segment the flower pattern. Second, we register the designed color pattern with images of the deformed jacquard fabric.
Our approach could also benefit other applications. Figure 9a shows an image of characters embedded in a fence. With our method applied to the image, the characters are successfully extracted, as shown in Fig. 9b. Optical character recognition (OCR) of characters embedded in fences in natural scenes can thus benefit from our approach.
5 Conclusions and future work
We have presented an image segmentation method for images with stripe or fence texture, with applications to multimedia security. Our approach consists of three steps: de-striping by frequency filtering, structure-preserving image smoothing, and classifying pixels as foreground or background. Our method achieves very promising results. In future work, we will focus on segmenting images with repeated patterns that are not restricted to stripes, which could benefit more multimedia applications.
References
1. Bao B-K, Liu G, Hong R, Yan S, Changsheng X (2013) General subspace learning with corrupted training data via graph embedding. IEEE Trans Image Process 22(11):4380–4393
2. Bouali M, Ladjal S (2011) Toward optimal destriping of MODIS data using a unidirectional variational model. IEEE Trans Geosci Remote Sens 49(8):2924–2935
3. Boykov Y, Jolly M-P (2001) Interactive graph cuts for optimal boundary and region segmentation of objects in N-D images. In: Proceedings of Eighth IEEE international conference on computer vision. Vancouver, pp 105–112
4. Chen J, Shao Y, Guo H, Wang W, Zhu B (2003) Destriping CMODIS data by power filtering. IEEE Trans Geosci Remote Sens 41(9):2119–2124
5. Comaniciu D, Meer P (2002) Mean shift: a robust approach toward feature space analysis. IEEE Trans Pattern Anal Mach Intell 24(5):603–619
6. Fang M-Y, Kuan Y-H, Kuo C-M (2012) Effective image retrieval techniques based on novel salient region segmentation and relevance feedback. Multimed Tools Appl 57:501–525
7. Grady L, Singh V, Kohlberger T, Alcino C, Bahlmann C (2012) Automatic segmentation of unknown objects, with application to baggage security. In: Proceedings of IEEE European conference on computer vision. Firenze, pp 430–444
8. Hong C, Zhu J, Jun Y, Cheng J, Chen X (2014) Realtime and robust object matching with a large number of templates. Multimed Tools Appl 75:1459–1480
9. Karacan L, Erdem E, Erdem A (2013) Structure-preserving image smoothing via region covariances. ACM Trans Graph 32:6
10. Li T, Chang H, Wang M, Ni B, Hong R, Yan S (2015) Crowded scene analysis: a survey. IEEE Trans Circ Syst Video Technol 25(3):367–386
11. Lin C, Pun C-M, Huang G (2016) Highly non-rigid video object tracking using segment-based object candidates. Multimed Tools Appl
12. Münch B, Trtik P, Marone F, Stampanoni M (2009) Stripe and ring artifact removal with combined wavelet-Fourier filtering. Opt Express 17(10):8567–8591
13. Pande-Chhetri R, Abd-Elrahman A (2011) De-striping hyperspectral imagery using wavelet transform and adaptive frequency domain filtering. ISPRS J Photogramm Remote Sens 66(5):620–636
14. Rakwatin P, Takeuchi W, Yasuoka Y (2009) Restoration of Aqua MODIS band 6 using histogram matching and local least squares fitting. IEEE Trans Geosci Remote Sens 47(2):613–627
15. Sang J, Changsheng X, Liu J (2012) User-aware image tag refinement via ternary semantic analysis. IEEE Trans Multimed 14(3-2):883–895
16. Shen H, Zhang L (2009) A MAP-based algorithm for destriping and inpainting of remotely sensed images. IEEE Trans Geosci Remote Sens 47(5):1492–1502
17. Shi J, Malik J (2000) Normalized cuts and image segmentation. IEEE Trans Pattern Anal Mach Intell 22(8):888–905
18. Tang J, Tao D, Qi G-J, Huet B (2014) Social media mining and knowledge discovery. Multimed Syst 20(6):633–634
19. Vese L, Chan T (2002) A multiphase level set framework for image segmentation using the Mumford and Shah model. Int J Comput Vis 50(3):271–293
20. Wang W, Yan Y, Zhang L, Hong R, Sebe N (2016) Collaborative sparse coding for multi-view action recognition. IEEE Multimed Mag 23(4):80–87
21. Wegener M (1990) Destriping multiple sensor imagery by improved histogram matching. Int J Remote Sens 11(5):859–875
22. Xu L, Yan Q, Xia Y, Jia J (2012) Structure extraction from texture via relative total variation. ACM Trans Graph 31:6
23. Yi C, Yan L, Tao W, Zhong S (2016) Remote sensing image stripe noise removal: from image decomposition perspective. IEEE Trans Geosci Remote Sens 54(12):7018–7031
24. Zhang F, Dai L, Xiang S, Zhang X (2015) Segment graph based image filtering: fast structure-preserving smoothing. In: Proceedings of IEEE conference on computer vision and pattern recognition. Boston, pp 361–369
25. Zhang L, Hong R, Gao Y, Ji R, Dai Q, Li X (2016) Image categorization by learning a propagated graphlet path. IEEE Trans Neural Netw Learn Syst 27(3):674–685
26. Zhang L, Li X, Nie L, Yan Y, Zimmermann R (2016) Semantic photo retargeting under noisy image labels. ACM Trans Multimed Comput Commun Appl 12(3):37
27. Zhang L, Li X, Nie L, Yi Y, Xia Y (2016) Weakly supervised human fixations prediction. IEEE Trans CyBern 46(1):258–269
28. Zhang L, Song M, Li N, Bu J, Chen C (2009) Feature selection for fast speech emotion recognition. In: Proceedings of ACM international conference on multimedia. Beijing, pp 753–756
29. Zhang L, Song M, Liu Z, Liu X, Bu J, Chen C (2013) Probabilistic graphlet cut: exploring spatial structure cue for weakly supervised image segmentation. In: Proceedings of IEEE conference on computer vision and pattern recognition. Portland, pp 1908–1915
30. Zhang L, Song M, Qi Z, Liu X, Jiajun B, Chen C (2013) Probabilistic graphlet transfer for photo cropping. IEEE Trans Image Process 21(5):2887–2897
31. Zhang L, Wang M, Hong R, Yin B-C, Li X (2016) Large-scale aerial image categorization using a multitask topological codebook. IEEE Trans CyBern 46(2):535–545
32. Zhang L, Yang Y, Wang M, Hong R, Nie L, Li X (2016) Detecting densely distributed graph patterns for fine-grained image categorization. IEEE Trans Image Process 25(2):553–565
33. Zhang L, Yi Y, Gao Y, Wang C, Yi Y, Li X (2014) A probabilistic associative model for segmenting weakly-supervised images. IEEE Trans Image Process 23(9):4150–4159
34. Zhang Q, Shen X, Xu L, Jia J (2014) Rolling guidance filter. In: Proceedings of European conference on computer vision. Zurich, pp 815–830
35. Zhang Z, Shi Z, Guo W, Huang S (2005) Adaptively image de-striping through frequency filtering. In: ICO20: Opt. Inf. Proc., Proc. SPIE: 6027, pp 989–996
Acknowledgments
This work was supported by the National Natural Science Foundation of China (grant number 61472348).
Cite this article
Ren, J., Chen, G., Li, X. et al. Striped-texture image segmentation with application to multimedia security. Multimed Tools Appl 78, 26965–26978 (2019). https://doi.org/10.1007/s11042-017-4479-2
DOI: https://doi.org/10.1007/s11042-017-4479-2