1 Introduction

Salient object detection (SOD) aims at distinguishing the most significant objects in a given image and helps in understanding the image scene. This process describes the characteristics of objects or regions of a digital image that attract human attention. Typically, SOD methods take a digital image as input and generate a probability map called a saliency map [1]. This map distinguishes salient from non-salient objects or regions in the image and is applied as a pre-processing step in various computer vision applications such as image retrieval [2], image scene understanding [3], etc. Over the past few decades, various saliency detection approaches have been suggested to identify salient objects or regions in natural images. In general, these approaches are broadly classified into two categories: bottom-up approaches [1, 4] and top-down approaches [5, 6]. To accomplish the saliency detection process effectively, a method should be able to discriminate the salient object from a complex background. Visual data carries various effective features that play an important role in grabbing human visual attention from complex backgrounds.

Feature combination is one of the most effective approaches to improve saliency detection. Many existing works on visual feature combination, such as [7, 8], have improved the performance of salient object detection. Typically, salient object detection methods combine features that complement each other, such that one feature identifies saliency in some regions while others capture saliency in the remaining regions. Based on this feature combination approach, a conditional random field (CRF) based feature combination method was introduced by Liu et al. [7]. They also extracted novel salient features from the input image, namely the center-surround histogram, multi-scale contrast and color spatial distribution, which enable the method to capture saliency regionally, locally, and globally. Some methods [9,10,11] addressed salient feature combination with metaheuristic search algorithms that optimize feature combination weights, defining an appropriate ratio of the salient features in terms of learned weights. These methods learn a set of weights using different metaheuristic search algorithms and combine the visual features extracted from the input image uniformly over the entire image for saliency detection. However, the significance of a visual feature for describing saliency may vary across regions. Thus, these methods may fail to find feature weights that suit every image region, which degrades saliency detection performance in cluttered and complex backgrounds. This shortcoming arises because these feature combination approaches do not consider regional importance during feature combination.

Motivation

To better handle the combination of salient features for saliency detection, a region-wise feature combination strategy is required that integrates the regionally most important salient features. In this strategy, feature combination weights are learned for each region instead of the entire image. Such an approach combines salient features regionally with dynamic weights, improving saliency detection performance even in complex images. For better understanding, several visual cases are shown in Fig. 1, where the third column from the left (i.e., Fig. 1 (c)) shows the saliency results of our region-based feature combination method and the fourth, fifth and sixth columns (i.e., Fig. 1 (d), (e), and (f), respectively) illustrate the saliency results of previous metaheuristic-based feature combination methods, namely constrained particle swarm optimization (CPSO) [9], biogeography-based optimization (BBO) [10] and SOFT [11]. The figure clearly shows that the proposed method achieves a significant improvement over state-of-the-art saliency methods [9,10,11]. Inspired by this advantage, we propose a novel feature combination approach that considers the region-based importance of visual features for detecting the salient object.

Fig. 1

Illustration of the motivation behind the proposed model. (a) Input images and (b) ground truths. Saliency maps generated by (c) the proposed model and the state-of-the-art methods (d) CPSO [9], (e) BBO [10] and (f) SOFT [11]

In this paper, we propose a salient object detection method that combines various salient features based on image regions, which improves the performance of saliency detection. Initially, the input image is partitioned into meaningful homogeneous regions using simple linear iterative clustering (SLIC) [12]. Then, several salient features are extracted from the input image based on the criterion of complementary characteristics. Such a feature selection approach increases the performance of the feature combination method because the features complement each other and thus compute better saliency values in different regions of the image. Afterwards, a metaheuristic optimization algorithm is utilized to learn region-wise feature weights. This weight learning step provides a set of weights for each feature in each region, describing the dynamic ratio of the various features across the image. Lastly, the salient features are dynamically combined region-wise with the help of the learnt weights. The proposed method highlights salient regions uniformly even in a cluttered and complex background and improves saliency detection performance due to its ability to capture the region-wise importance of salient features. Our main contributions are summarized as follows:

  1. We have proposed a novel feature combination approach that combines various visual features based on image regions.

  2. This method improves saliency detection performance in cluttered and complex image scenes. To the best of our knowledge, the region-based feature combination approach has not yet been used in saliency detection.

  3. A metaheuristic optimization algorithm has been employed region-wise to optimize the contribution of visual features for saliency detection.

  4. Empirical results demonstrate that the proposed model consistently achieves better performance than eight state-of-the-art saliency detection methods in terms of several performance measures over five publicly available benchmark datasets.

The rest of this paper is organized as follows: Section 2 describes related work of saliency detection in brief. Section 3 provides the details of the proposed method. In Section 4, experimental results are presented and analyzed. Finally, conclusion and future work are stated in Section 5.

2 Related work

In the past few decades, an increasing number of salient object detection methods have been suggested, reporting significant improvements in saliency detection performance [13]. First, Itti et al. [1] introduced a computational saliency detection method that extracted hand-crafted visual features at multiple scales and linearly combined them to generate a saliency map. Hou and Zhang [14] introduced a spectral residual-based saliency approach and demonstrated its general ability to detect proto-objects. Zhang et al. [15] proposed a bottom-up saliency method that exploited natural statistics within a Bayesian framework to detect salient objects. Seo and Milanfar [16] exploited a self-resemblance approach for predicting saliency. Goferman et al. [17] suggested a context-aware saliency detection approach based on four principles of human attention. Murray et al. [18] introduced an effective saliency detection method based on color appearance in human vision.

Further, many graph-based saliency detection methods [19,20,21,22] have achieved promising performance. Zhu et al. [19] suggested a saliency prediction approach via affinity graph learning and weighted manifold ranking. Nour et al. [20] proposed a novel multi-graph-based method for salient object detection, in which an edge weight matrix is constructed by utilizing color, spatial and background labels. Wang et al. [21] presented saliency detection by incorporating multifeature-based boundary ranking and boundary connectivity ranking. Wang et al. [22] suggested graph-based saliency detection via learning a joint affinity matrix. Among feature combination based saliency detection approaches, Liu et al. [7] suggested a saliency detection method that extracts three novel features and trains a conditional random field model to learn a set of weights for feature combination.

In recent years, the extensive application of metaheuristic optimization algorithms to various optimization problems has unfolded new ideas for feature combination in salient object detection. In this direction, several feature fusion approaches have been suggested via metaheuristic optimization algorithms. Singh et al. [9] introduced a novel feature combination approach based on constrained particle swarm optimization (C-PSO). Wang et al. [23] presented a visual feature fusion framework via biogeography-based optimization (BBO) and its variant metaheuristic optimization algorithms. The method utilized the fitness function suggested in the Singh et al. [9] model, with feature maps extracted using the Liu et al. [7] model. In [11], an efficient feature combination framework has been proposed based on teaching-learning-based optimization (TLBO). The method introduced a novel fitness function to effectively optimize the TLBO learning parameters. In many studies, deep learning based feature combination methods are employed to improve the performance of salient region detection. Li et al. [8] introduced a hierarchical feature fusion network (HFFNet) that fuses features hierarchically to extract high-level semantic information and low-level edge information. Gao et al. [24] suggested a mutually supervised few-shot segmentation network that combines visual features for image segmentation. In [25], local and global attention mechanisms are combined for dish image recognition.

3 Proposed method

In this section, we present the proposed method, based on region-wise dynamic feature combination for salient object detection. This method dynamically combines various salient features over different image regions. The goal is to provide an appropriate weight for each feature, region-wise, so that a better saliency value can be captured from each region during the combination of salient features. Therefore, the proposed method is robust in finding the salient object in various challenging natural image scenarios. Robustness in salient object detection means consistently maintaining performance under challenging conditions such as cluttered backgrounds, heterogeneous foregrounds and low contrast between background and foreground. The key feature of the proposed method is region-wise visual feature combination, which dynamically combines basic visual features according to their importance in various image regions. In this method, an input image is segmented into homogeneous regions that preserve all the significant visual attention cues while removing destructive visual information. Then, various salient features are extracted from the image based on their complementary characteristics. These features are redefined for each region, assigning similar feature values to all image elements of a region. Then, a nature-inspired optimization algorithm called constrained particle swarm optimization (CPSO), a constrained variant of particle swarm optimization (PSO) [26], is employed to learn the regional weights. Here, the selection of CPSO is based on the quality of the saliency results and the computational time [11]. Afterwards, the salient features are regionally combined through the learnt weights to generate high-quality saliency maps.

As illustrated in Fig. 2, the proposed method consists of four steps: region formation, feature extraction, region-wise weight learning and region-based feature combination. In region formation, the input image is partitioned into meaningful regions. In the feature extraction step, selected feature extraction techniques suggested in state-of-the-art saliency detection methods are employed to extract the features. In the third step, region-wise weight vectors are learnt using a nature-inspired optimization algorithm. Lastly, the extracted features are combined using the learnt weight vectors to generate the final saliency map.

Fig. 2

Flowchart of the proposed model. The features \({\textbf {f }}=\{{\textbf {f }}_1\), \({\textbf {f }}_2\), \({\textbf {f }}_3\}\) are extracted from the input image using the existing saliency methods [27], [28], and [29], respectively. The learnt weight is denoted by \(\hat{\theta } \)

3.1 Region formation

An image region can be defined as a set of pixels that share homogeneous features. The proposed model employs simple linear iterative clustering (SLIC) [12] to partition the input image into meaningful regions. This pre-processing breaks the input image I into n regions, defined as \(R=\{1, 2, ...,n\}\).
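The region-formation contract can be illustrated with a minimal sketch. The proposed model uses SLIC [12]; since SLIC itself cannot be reproduced in a few lines, a regular grid partition stands in here as a hypothetical simplification (the function name `grid_regions` is ours, not from [12]). What the later steps rely on is only the output contract: a label map assigning each pixel to one of the n regions.

```python
import numpy as np

def grid_regions(image, n_side=4):
    """Partition an image into a regular grid of regions.

    A simplified stand-in for SLIC superpixels: real SLIC clusters
    pixels by colour and position, but the output contract is the
    same -- an (H, W) label map with one label per region.
    """
    h, w = image.shape[:2]
    rows = np.minimum(np.arange(h) * n_side // h, n_side - 1)
    cols = np.minimum(np.arange(w) * n_side // w, n_side - 1)
    labels = rows[:, None] * n_side + cols[None, :]
    return labels  # values in {0, ..., n_side**2 - 1}

img = np.zeros((60, 80, 3))
labels = grid_regions(img, n_side=4)
```

In the full model, this label map is all that the weight-learning and combination steps need from the segmentation stage.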

3.2 Feature extraction

Visual features play a significant role in the formation of visual attention. In this model, three visual features are identified from state-of-the-art salient object detection methods and represented as (\({\textbf {f }}_1, {\textbf {f }}_2, {\textbf {f }}_3\)). The feature \({\textbf {f }}_1\) is extracted from the input image using a background measure based on boundary connectivity (BC) [27], which provides highly reliable background knowledge. The contrast cluster (CC) [28] feature is extracted and represented as \({\textbf {f }}_2\); it captures distinctive visual information from the input image. Similarly, \({\textbf {f }}_3\) is extracted as the minimum directional contrast (MDC) [29], which represents contrasts along different spatial directions.
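The MDC idea can be sketched at region level. The version below is our own hypothetical simplification of [29], not the authors' implementation: each region's mean intensity is contrasted against the regions lying to its left, right, above and below, and the minimum is kept, so background regions that blend with the image content on at least one side receive a low score.

```python
import numpy as np

def min_directional_contrast(gray, labels):
    """Per-region minimum directional contrast (simplified MDC sketch).

    A region is salient if it contrasts with the image content in
    *every* direction; background usually touches similar content on
    at least one side, so its minimum directional contrast is low.
    Operates on region means rather than raw pixels for brevity.
    """
    n = labels.max() + 1
    means = np.array([gray[labels == r].mean() for r in range(n)])
    ys, xs = np.indices(labels.shape)
    cy = np.array([ys[labels == r].mean() for r in range(n)])
    cx = np.array([xs[labels == r].mean() for r in range(n)])
    f = np.zeros(n)
    for r in range(n):
        contrasts = []
        # regions strictly to the left / right / above / below of r
        for mask in (cx < cx[r], cx > cx[r], cy < cy[r], cy > cy[r]):
            if mask.any():
                contrasts.append(np.abs(means[mask] - means[r]).mean())
        f[r] = min(contrasts) if contrasts else 0.0
    return f

# toy example: a bright centre region on a dark background
gray = np.zeros((30, 30))
gray[10:20, 10:20] = 1.0
labels = (np.arange(30) // 10)[:, None] * 3 + (np.arange(30) // 10)[None, :]
f = min_directional_contrast(gray, labels)
```

On this toy image the centre region contrasts with its surroundings in all four directions, so its score dominates that of the corner regions.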

3.3 Region-wise weight learning

In the proposed model, the extracted features are combined for salient object detection. Different combination techniques, such as linear and weighted linear, can be applied to the features for computing salient regions. The proposed approach combines the features in a weighted linear manner, which requires an appropriate weight vector. Here, constrained particle swarm optimization (CPSO), a constrained variant of particle swarm optimization (PSO) [26], is used to learn the weights for each image. In [9], CPSO is applied to learn a single weight vector that combines the visual features over the entire image, so region-wise importance is not captured. The proposed model addresses this drawback by learning a weight vector for each meaningful image region, which improves saliency detection performance.

Let \(\varvec{\theta }= \{\varvec{\theta }_{1}, \varvec{\theta }_{2}, ..., \varvec{\theta }_{n}\}\) be the weight vector employed to combine the image features, where \(\varvec{\theta }_{n}=\{\theta _{n_1}, \theta _{n_2}, ..., \theta _{n_N}\}\) is the weight vector for the n-th image region. To obtain the optimal weight vector \(\varvec{\theta }\), CPSO is employed region-wise on the input image with a fitness function similar to [9]. Initially, the visual features of the n-th image region are integrated using the weight vector \(\varvec{\theta }_{n}\) as follows:

$$\begin{aligned} \textbf{S}(\varvec{\theta }_{n})=\sum _{d=1}^{N} \theta _{n_d} \times \mathbf {\textit{f}}_{n_d} \end{aligned}$$
(1)

where \(\mathbf {\textit{f}}_{n_d}\) is the d-th feature of the n-th region, N is the number of features and \(\theta _{n_d}\) is the weight of the d-th feature in the n-th region. The initial saliency map \(\textbf{S}(\varvec{\theta })\) is represented as follows:

$$\begin{aligned} \textbf{S}(\varvec{\theta })=\{\textbf{S}(\varvec{\theta }_{1}), \textbf{S}(\varvec{\theta }_{2}), ..., \textbf{S}(\varvec{\theta }_{n})\} \end{aligned}$$
(2)
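Equations (1)-(2) amount to a per-region weighted sum of the feature maps, which can be sketched directly (the array shapes and function name are our assumption, chosen for the sketch):

```python
import numpy as np

def combine_features(features, labels, theta):
    """Region-wise weighted linear combination of Eqs. (1)-(2).

    features : (N, H, W) stack of feature maps
    labels   : (H, W) region label map, values in {0, ..., n-1}
    theta    : (n, N) matrix holding one weight vector per region
    """
    S = np.zeros(features.shape[1:])
    for r in range(theta.shape[0]):
        mask = labels == r
        # S(theta_n) = sum_d theta_{n_d} * f_{n_d}   (Eq. (1))
        S[mask] = np.tensordot(theta[r], features[:, mask], axes=1)
    return S

# toy example: two constant feature maps, two vertical half regions
features = np.stack([np.ones((4, 6)), 2 * np.ones((4, 6))])
labels = np.zeros((4, 6), dtype=int)
labels[:, 3:] = 1
theta = np.array([[1.0, 0.0],   # left region uses only feature 1
                  [0.0, 1.0]])  # right region uses only feature 2
S = combine_features(features, labels, theta)
```

The toy weights make the left half of the map follow the first feature and the right half the second, which is exactly the region-wise dynamic behaviour the method relies on.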

The initial saliency map \(\textbf{S}(\varvec{\theta })\) is partitioned into two regions, namely the salient region and the non-salient region, by applying an adaptive threshold which is calculated in two steps. First, the Canny edge operator is applied to \(\textbf{S}(\varvec{\theta })\) to obtain the edge map \(\textbf{E}(\varvec{\theta })\) as follows [9]:

$$\begin{aligned} \textbf{E}(\varvec{\theta })={\left\{ \begin{array}{ll} 1 &{} \text { if } pixel \in edge(\textbf{S}(\varvec{\theta })) \\ 0 &{} \textit{otherwise} \end{array}\right. } \end{aligned}$$
(3)

Then, the saliency values at the edge pixels of \(\textbf{E}(\varvec{\theta })\) are collected and their mean is computed as the threshold \(\eta (\varvec{\theta })\) [9]:

$$\begin{aligned} \eta (\varvec{\theta })=\frac{\sum _{p \in P_{I}}\textbf{E}(p,\varvec{\theta }). \textbf{S}(p,\varvec{\theta })}{\sum _{p \in P_{I}}\textbf{E}(p,\varvec{\theta })} \end{aligned}$$
(4)

where \(P_{I}\) denotes the set of image pixels. The binary map \(\textbf{B}(\varvec{\theta })\) is computed by applying the threshold value \(\eta (\varvec{\theta })\) to \(\textbf{S}(\varvec{\theta })\) as follows [9]:

$$\begin{aligned} \textbf{B}(\varvec{\theta })={\left\{ \begin{array}{ll} 1 &{} \text { where } \textbf{S}(\varvec{\theta }) \ge \eta (\varvec{\theta }) \\ 0 &{} \textit{otherwise} \end{array}\right. } \end{aligned}$$
(5)
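Equations (3)-(5) can be sketched as follows. The paper applies the Canny operator to obtain the edge map; to keep the sketch dependency-free we substitute a simple gradient-magnitude mask, so this is a hedged approximation rather than the exact operator:

```python
import numpy as np

def binarize_saliency(S):
    """Adaptive thresholding of Eqs. (3)-(5).

    A gradient-magnitude mask stands in for the Canny edge map E
    (a simplification, not the operator used in the paper).
    """
    gy, gx = np.gradient(S)
    grad = np.hypot(gx, gy)
    E = grad > grad.mean()                       # stand-in edge map  (Eq. (3))
    eta = S[E].mean() if E.any() else S.mean()   # mean saliency on edges (Eq. (4))
    B = (S >= eta).astype(int)                   # binary map (Eq. (5))
    return B, eta

# toy saliency map: a bright square on a dark background
S = np.zeros((20, 20))
S[5:15, 5:15] = 1.0
B, eta = binarize_saliency(S)
```

Because edge pixels straddle the object boundary, the threshold lands between the foreground and background saliency levels, and the binary map recovers the bright square.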

In visual attention, the contribution of salient regions should be maximized and that of non-salient regions minimized. Consequently, the fitness function \(fit (\varvec{\theta })\) is defined as follows [9]:

$$\begin{aligned} fit (\varvec{\theta })= \sum _{p \in P_{sal}}(1-\textbf{S}(p,\varvec{\theta }))+ \sum _{p \in P_{nonsal}}\textbf{S}(p,\varvec{\theta }) \end{aligned}$$
(6)

where \(P_{sal}\) and \(P_{nonsal}\) denote the sets of salient and non-salient pixels, which are obtained by jointly considering \(\textbf{S}(\varvec{\theta })\) and \(\textbf{B}(\varvec{\theta })\). To optimize the weight \(\varvec{\theta }\), \(fit (\varvec{\theta })\) is minimized and the optimized weight is represented as \(\hat{\varvec{\theta }}\).
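The fitness of Eq. (6) and its minimization can be sketched as follows. Note that CPSO is a constrained variant of PSO [26]; the minimal optimizer below is plain PSO over [0, 1]-bounded weights, a stand-in under our own parameter choices rather than the authors' exact algorithm (all names are ours):

```python
import numpy as np

def fitness(S, B):
    """Eq. (6): penalise low saliency inside the binary foreground
    mask B and any residual saliency outside it."""
    sal = B.astype(bool)
    return (1.0 - S[sal]).sum() + S[~sal].sum()

def pso_weights(eval_theta, dim, n_particles=10, iters=30, seed=0):
    """Minimal plain PSO minimising eval_theta over [0, 1]^dim."""
    rng = np.random.default_rng(seed)
    x = rng.random((n_particles, dim))          # particle positions
    v = np.zeros_like(x)                        # velocities
    pbest = x.copy()
    pbest_f = np.array([eval_theta(p) for p in x])
    g = pbest[pbest_f.argmin()].copy()          # global best
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        v = 0.7 * v + 1.5 * r1 * (pbest - x) + 1.5 * r2 * (g - x)
        x = np.clip(x + v, 0.0, 1.0)            # keep weights bounded
        f = np.array([eval_theta(p) for p in x])
        improved = f < pbest_f
        pbest[improved], pbest_f[improved] = x[improved], f[improved]
        g = pbest[pbest_f.argmin()].copy()
    return g

# sanity checks: zero fitness when S matches B; PSO finds a known optimum
S_demo = np.array([[1.0, 0.0], [0.0, 1.0]])
B_demo = (S_demo > 0.5).astype(int)
theta_hat = pso_weights(lambda t: ((t - 0.3) ** 2).sum(), dim=3)
```

In the full model, `eval_theta` would recombine a region's features with the candidate weights (Eq. (1)), rebuild the binary map (Eqs. (3)-(5)) and return Eq. (6); the quadratic here only verifies that the optimizer converges.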

3.4 Region-based feature combination

The weight vector \(\hat{\varvec{\theta }}\) learnt using CPSO is used to combine the extracted visual features \({\textbf {f }}=\{{\textbf {f }}_1\), \({\textbf {f }}_2\), \({\textbf {f }}_3\}\) region-wise to generate the final saliency map \({\textbf {S}}\). For the n-th region, the saliency map \({\textbf {S}}(\hat{\varvec{\theta }}_{n})\) is computed as follows:

$$\begin{aligned} \textbf{S}(\hat{\varvec{\theta }}_{n})=\sum _{d=1}^{N} \hat{\varvec{\theta }}_{n_d} \times \textbf{ f }_{n_d} \end{aligned}$$
(7)

The saliency map \({\textbf {S}}\) is obtained as follows:

$$\begin{aligned} \textbf{S}=\{\textbf{S}(\hat{\varvec{\theta }}_{1}), \textbf{S}(\hat{\varvec{\theta }}_{2}), ..., \textbf{S}(\hat{\varvec{\theta }}_{n})\} \end{aligned}$$
(8)

4 Experimental results

To validate the efficacy of the proposed model, we have conducted extensive experiments on five publicly available datasets, viz. MSRA10K (10,000 images) [30], DUT-OMRON (5,168 images) [31], ECSSD (1,000 images) [32], PASCAL-S (850 images) [33] and SED2 (100 images) [34], against eight saliency detection methods, which include three metaheuristic feature fusion methods, namely C-PSO [9], BBO [10] and SOFT [11], and five traditional methods, namely SR [14], SUN [15], SeR [16], CA [17], and SIM [18]. For the quantitative evaluation, six widely applied evaluation metrics, including Precision, Recall, Receiver Operating Characteristic (ROC), F-measure, Area Under the Curve (AUC) and Mean Absolute Error (MAE), are employed to compare the performance of the proposed model with the state-of-the-art saliency methods.

Fig. 3

Examples of visual comparison of the proposed model with the state-of-the-art methods on five benchmark datasets

4.1 Qualitative comparison

To show the effectiveness of the proposed method against the compared saliency detection methods, a visual comparison is presented in Fig. 3. It is to be noted that all the compared traditional saliency detection methods, namely SR [14], SUN [15], SeR [16], CA [17], and SIM [18], face difficulties in generating uniformly highlighted salient regions on almost all images, as shown in Fig. 3. On the other hand, the proposed model generates high-quality saliency maps not only on images with a simple background or homogeneous objects, such as columns 1, 5 and 7, but also on complex and cluttered background images, such as columns 2, 3, 4 and 5. The reason for this behaviour is the combination of various salient features that complement each other in spatially heterogeneous visual regions.

In addition, Fig. 3 presents a visual comparison of the proposed method with the state-of-the-art feature combination methods [9,10,11] based on metaheuristic optimization algorithms. It can be easily seen that C-PSO [9] captures salient objects with unnecessary background information, which degrades its performance; BBO [10] performs similarly, with a slight improvement on challenging images such as columns 3 and 5. The SOFT [11] method generates good-quality saliency maps but also fails to remove background noise, which affects its saliency detection performance. These visual samples validate that the previous works fail to suppress background noise and to uniformly highlight salient regions, which reduces their performance. However, it can be easily observed from Fig. 3 that the proposed region-wise feature combination method significantly improves saliency detection over its previous counterparts due to its dynamic feature combination strategy. As illustrated in Fig. 3, the proposed method uniformly highlights salient regions and effectively suppresses unnecessary background information in various challenging scenarios. This superior performance confirms its robustness in handling various types of challenging natural images, such as cluttered backgrounds and heterogeneous foregrounds. Overall, the qualitative results validate that the proposed model provides a clear separation between foreground and background even in complex images.

Fig. 4

Precision comparison of the proposed model with the state-of-the-art methods on five benchmark datasets

4.2 Quantitative comparison

For better understanding, we have also conducted quantitative comparisons of the proposed model with the traditional saliency detection methods and its previous counterparts. Figure 4 shows the Precision scores of the proposed method and the compared state-of-the-art saliency detection methods on all five datasets. From Fig. 4, it can be easily observed that the proposed method significantly improves the Precision score over all the compared methods, including C-PSO [9], BBO [10] and SOFT [11], on all datasets. This experimental result confirms that the proposed method achieves higher detection accuracy than all the compared methods. The reasons for this outcome are the effective selection and combination of basic salient features, which validates the proposed dynamic region-wise feature combination approach and the complementary feature selection technique. Figure 5 provides the Recall scores of the proposed method and the compared state-of-the-art saliency detection methods on all five datasets. As illustrated in Fig. 5, the proposed method outperforms all compared saliency detection methods, including its previous counterparts [9,10,11], across all five datasets. This observation validates that the proposed method significantly improves saliency detection in terms of completeness, which refers to completely capturing the salient objects present in a natural image. Such performance supports the incorporation of complementary salient features, which complement each other in visually heterogeneous regions to maximize the identified salient regions.

Fig. 5

Recall comparison of the proposed model with the state-of-the-art methods on five benchmark datasets

Fig. 6

F-measure comparison of the proposed model with the state-of-the-art methods on five benchmark datasets

Further, the superior performance of the proposed method against the compared saliency detection methods in terms of F-measure on all five datasets is shown in Fig. 6. It is worth noting that the proposed method outperforms all compared saliency detection methods, including its previous counterparts [9,10,11], across all five datasets. This result shows that the proposed method effectively balances the relationship between Precision and Recall and further validates its robustness. Next, the performance of the proposed method is examined against the compared saliency detection methods in terms of MAE across all five datasets. As clearly indicated in Fig. 7, the proposed method achieves a better MAE score than the compared saliency detection methods on all five datasets, which justifies the concept of the dynamic region-wise complementary feature combination approach. Subsequently, the performance is analyzed in terms of the AUC score. As illustrated in Fig. 8, the proposed model outperforms the compared traditional saliency detection methods on all five datasets, achieves an AUC score almost similar to C-PSO [9] and BBO [10], and is comparable with SOFT [11].

Fig. 7

MAE comparison of the proposed model with the state-of-the-art methods on five benchmark datasets

Fig. 8

AUC comparison of the proposed model with the state-of-the-art methods on five benchmark datasets

Fig. 9

ROC on the five datasets: (a) MSRA10K [30], (b) DUT-OMRON [31], (c) ECSSD [32], (d) PASCAL-S [33] and (e) SED2 [34]

Figure 9 presents the ROC curves of the proposed model and the compared state-of-the-art methods on all five datasets. It can be observed that the detection accuracy of the proposed model is consistently better than that of the compared traditional saliency detection methods over varying thresholds on all five datasets. It performs better than C-PSO [9] and BBO [10] when the threshold is between 0 and approximately 0.2 on DUT-OMRON [31], ECSSD [32] and PASCAL-S [33], and between 0 and approximately 0.1 on MSRA10K [30]; beyond this, the proposed method is comparable with C-PSO [9] and BBO [10]. On the SED2 [34] dataset, the proposed method outperforms C-PSO [9], BBO [10] and SOFT [11] when the threshold is between 0 and approximately 0.1, after which it is comparable with them. The proposed method is comparable with SOFT [11] on the MSRA10K [30], DUT-OMRON [31], ECSSD [32] and PASCAL-S [33] datasets. In general, the proposed model outperforms the compared traditional saliency detection methods in terms of Precision, Recall, F-measure, MAE, AUC and ROC on all five datasets. This superior performance supports the effectiveness of the regional combination of complementary visual features. Further, the proposed method achieves improvements in Precision, Recall, F-measure and MAE over its previous counterparts [9,10,11], while its performance is comparable with C-PSO [9], BBO [10] and SOFT [11] in terms of AUC and ROC. These experimental results show that the proposed model improves saliency detection performance due to the region-wise combination of complementary features, which helps to uniformly highlight salient regions and effectively suppress background regions.

4.3 Discussion

The region-based feature combination method aims to combine basic complementary salient features to improve saliency detection performance. Its key characteristics are the selection of complementary salient features and their region-wise combination, which enable the proposed method to improve the accuracy and completeness of salient object detection. The experimental results analyzed in Sections 4.1 and 4.2 support the effectiveness of the proposed method. In Section 4.1, the variety of visual examples confirms the effective performance of the proposed method in challenging scenarios. As presented in Section 4.2, the Precision of the proposed method is higher than that of the compared methods, which confirms that it accurately finds the salient regions in the image. Similarly, the proposed method achieves a significant improvement in Recall over all compared saliency detection methods, which substantiates its completeness. Further, the proposed method outperforms the compared methods in terms of F-measure, demonstrating a good balance between Precision and Recall; this supports the robustness of the proposed method.

4.4 Computation time analysis

All methods are evaluated in the following configuration: Intel(R) Core(TM) i7-4770 CPU @ 3.40 GHz with 8 GB RAM; the computation time is measured on MSRA10K [30], as shown in Fig. 10. The proposed method is implemented in MATLAB. It can be observed from Fig. 10 that the proposed method requires less computational time than SOFT [11] and CA [17], while it spends almost the same time as C-PSO [9] and BBO [10].

Fig. 10

Running time (sec.) comparison of the proposed model with the state-of-the-art methods on the MSRA10K [30] dataset

5 Conclusion and future work

In this paper, a novel visual feature combination framework has been proposed for salient object detection. This fusion approach exploits region-wise visual saliency characteristics to effectively combine various features extracted from natural images. The framework utilizes the CPSO algorithm to find region-wise weight vectors and combines the features region-wise according to the learnt weights, generating a robust saliency map. Extensive experiments have been conducted to compare the proposed framework with eight state-of-the-art saliency detection methods on five publicly available saliency benchmark datasets. The experimental results show that the proposed framework outperforms or is comparable with the state-of-the-art saliency detection methods in terms of several performance metrics. In future work, we will focus on finding effective salient features and a combination algorithm that improves saliency prediction performance while remaining computationally efficient.