1 Introduction

Salient object detection (SOD) aims at distinguishing the most significant objects in a given image and helps in understanding the image scene. This process describes the characteristics of objects or regions of a digital image that attract human attention. Typically, SOD methods take a digital image as input and generate a probability map called a saliency map [1]. This map distinguishes salient from non-salient objects or regions in the image and is applied as a pre-processing step in various computer vision applications such as image retrieval [2], image scene understanding [3], etc. Over the past few decades, various saliency detection approaches have been suggested to identify salient objects or regions in natural images. In general, these approaches are broadly classified into two categories: bottom-up approaches [1, 4] and top-down approaches [5, 6]. To accomplish the saliency detection process effectively, a method should be able to discriminate the salient object from a complex background. Visual data carries various effective features that play an important role in grabbing human visual attention from complex backgrounds.

Feature combination is one of the most effective approaches to improve saliency detection. Many existing works on visual feature combination, such as [7, 8], have improved the performance of salient object detection. Typically, salient object detection methods combine features that complement each other, such that one feature identifies saliency in some regions while others capture saliency in the remaining regions. Based on this feature combination approach, a conditional random field (CRF) based feature combination method was introduced by Liu et al. [7]. They also extracted novel salient features from the input image, namely the center-surround histogram, multi-scale contrast and color spatial distribution, which enable the method to capture saliency regionally, locally, and globally. Some methods [9,10,11] addressed salient feature combination with metaheuristic search algorithms that optimize feature combination weights, defining an appropriate ratio of the salient features in terms of learned weights. These methods learn a set of weights using different metaheuristic search algorithms and combine the visual features extracted from the input image uniformly over the entire image for saliency detection. However, the significance of a visual feature for describing saliency may vary across regions. Thus, these methods may fail to find feature weights that suit every image region, which degrades saliency detection performance in cluttered and complex backgrounds. This shortcoming arises because these feature combination approaches do not consider regional importance during feature combination.

Motivation

To better handle the combination of salient features for saliency detection, a region-wise feature combination strategy is required that integrates the regionally most important salient features. In this strategy, feature combination weights are learned for each region instead of the entire image. Such an approach combines salient features regionally with dynamic weights, improving saliency detection performance even in complex images. For better understanding, several visual cases are shown in Fig. 1, where the third column from the left (i.e., Fig. 1 (c)) shows the saliency results of our region-based feature combination method and the fourth, fifth and sixth columns (i.e., Fig. 1 (d), (e), and (f), respectively) illustrate the saliency results of previous metaheuristic-based feature combination methods, namely constrained particle swarm optimization (CPSO) [9], biogeography-based optimization (BBO) [10] and SOFT [11]. The figure clearly shows that the proposed method achieves a significant improvement over state-of-the-art saliency methods [9,10,11]. Inspired by this advantage, we propose a novel feature combination approach that considers the region-based importance of visual features for detecting the salient object.

Fig. 1

Illustration of the motivation behind the proposed model. (a) Input images and (b) ground truths. Saliency maps generated by (c) the proposed model and the state-of-the-art methods (d) CPSO [9], (e) BBO [10] and (f) SOFT [11]

In this paper, we propose a salient object detection method that combines various salient features based on image regions, which improves the performance of saliency detection. Initially, the input image is partitioned into meaningful homogeneous regions using simple linear iterative clustering (SLIC) [12]. Then, several salient features are extracted from the input image based on the criterion of complementary characteristics. Such a feature selection approach increases the performance of the feature combination method because the features complement each other and thus compute better saliency values in different regions of the image. Afterwards, a metaheuristic optimization algorithm is utilized to learn region-wise feature weights. This weight learning step provides a set of weights for each feature in each region, describing the dynamic ratio of the various features across the image. Lastly, the salient features are dynamically combined region-wise with the help of the learnt weights. The proposed method highlights salient regions uniformly even in a cluttered and complex background and improves saliency detection performance due to its ability to capture the region-wise importance of salient features. Our main contributions are summarized as follows:

  1. We have proposed a novel feature combination approach that combines various visual features based on image regions.

  2. This method improves saliency detection performance in cluttered and complex image scenes. To the best of our knowledge, the region-based feature combination approach has not yet been used in saliency detection.

  3. A metaheuristic optimization algorithm has been employed region-wise to optimize the contribution of visual features for saliency detection.

  4. Empirical results demonstrate that the proposed model consistently achieves better performance than eight state-of-the-art saliency detection methods in terms of several performance measures over five publicly available benchmark datasets.

The rest of this paper is organized as follows: Section 2 describes related work of saliency detection in brief. Section 3 provides the details of the proposed method. In Section 4, experimental results are presented and analyzed. Finally, conclusion and future work are stated in Section 5.

2 Related work

In the past few decades, an increasing number of salient object detection methods have been suggested, reporting significant improvements in saliency detection performance [13]. First, Itti et al. [1] introduced a computational saliency detection method that extracted hand-crafted visual features at multiple scales and linearly combined them to generate a saliency map. Hou and Zhang [14] introduced a spectral residual-based saliency approach and demonstrated its general ability to detect proto-objects. Zhang et al. [15] proposed a bottom-up saliency method that exploited natural statistics within a Bayesian framework to detect salient objects. Seo and Milanfar [16] exploited a self-resemblance approach for predicting saliency. Goferman et al. [17] suggested a context-aware saliency detection approach based on four principles of human attention. Murray et al. [18] introduced an effective saliency detection method based on color appearance in human vision.

Further, many graph-based saliency detection methods [19,20,21,22] have achieved promising performance. Zhu et al. [19] suggested a saliency prediction approach via affinity graph learning and weighted manifold ranking. Nour et al. [20] proposed a novel multi-graph-based method for salient object detection, in which an edge weight matrix is constructed by utilizing color, spatial and background labels. Wang et al. [21] presented saliency detection by incorporating multifeature-based boundary ranking and boundary connectivity ranking. Wang et al. [22] suggested graph-based saliency detection via learning a joint affinity matrix. Among feature combination based saliency detection approaches, Liu et al. [7] suggested a saliency detection method that extracts three novel features and trains a conditional random field model to learn a set of weights for feature combination.

In recent years, the extensive application of metaheuristic optimization algorithms to various optimization problems has unfolded new ideas for feature combination in salient object detection. In this direction, several feature fusion approaches have been suggested via metaheuristic optimization algorithms. Singh et al. [9] introduced a novel feature combination approach based on constrained particle swarm optimization (C-PSO). Wang et al. [23] presented a visual feature fusion framework via biogeography-based optimization (BBO) and its variant metaheuristic optimization algorithms. The method utilized the fitness function suggested in the Singh et al. [9] model, with feature maps extracted using the Liu et al. [7] model. In [11], an efficient feature combination framework has been proposed based on teaching-learning-based optimization (TLBO). The method introduced a novel fitness function to effectively optimize the TLBO learning parameters. In many studies, deep learning based feature combination methods are employed to improve the performance of salient region detection. Li et al. [8] introduced a hierarchical feature fusion network (HFFNet) that fuses features hierarchically to extract high-level semantic information and low-level edge information. Gao et al. [24] suggested a mutually supervised few-shot segmentation network that combines visual features for image segmentation. In [25], local and global attention mechanisms are combined for dish image recognition.

3 Proposed method

In this section, we present the proposed method, based on region-wise dynamic feature combination for salient object detection. This method dynamically combines various salient features over different image regions. The goal is to provide an appropriate weight for each feature, region-wise, so that a better saliency value can be captured from each region during the combination of salient features. Therefore, the proposed method is robust in finding the salient object in various challenging natural image scenarios. Robustness in salient object detection means consistently maintaining performance under challenging conditions such as cluttered backgrounds, heterogeneous foregrounds and low contrast between background and foreground. The key feature of the proposed method is region-wise visual feature combination, which dynamically combines basic visual features according to their importance in various image regions. In this method, an input image is segmented into homogeneous regions that preserve all the significant visual attention cues while removing destructive visual information. Then, various salient features are extracted from the image based on their complementary characteristics. These features are redefined for each region, assigning similar feature values to all image elements of a region. Then, a nature-inspired optimization algorithm called constrained particle swarm optimization (CPSO), a constrained variant of particle swarm optimization (PSO) [26], is employed to learn the regional weights. Here, the selection of CPSO is based on the quality of the saliency results and the computational time [11]. Afterwards, the salient features are regionally combined through the learnt weights to generate high-quality saliency maps.

As illustrated in Fig. 2, the proposed method consists of four steps: region formation, feature extraction, region-wise weight learning and region-based feature combination. In region formation, the input image is partitioned into meaningful regions. In the feature extraction step, selected feature extraction techniques suggested in state-of-the-art saliency detection methods are employed to extract the features. In the third step, region-wise weight vectors are learnt using a nature-inspired optimization algorithm. Lastly, the extracted features are combined using the learnt weight vectors to generate the final saliency map.

Fig. 2

Flowchart of the proposed model. The features \({\textbf {f }}=\{{\textbf {f }}_1\), \({\textbf {f }}_2\), \({\textbf {f }}_3\}\) are extracted from the input image using the existing saliency methods [27], [28], and [29], respectively. The learnt weight is denoted by \(\hat{\theta } \)

3.1 Region formation

An image region can be defined as a set of pixels that share homogeneous features. The proposed model employs simple linear iterative clustering (SLIC) [12] to partition the input image into meaningful regions. This pre-processing breaks the input image I into n regions, defined as \(R=\{1, 2, ...,n\}\).
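The region-formation contract can be illustrated with a minimal sketch. The proposed model uses SLIC [12]; since SLIC itself cannot be reproduced in a few lines, a regular grid partition stands in here as a hypothetical simplification (the function name `grid_regions` is ours, not from [12]). What the later steps rely on is only the output contract: a label map assigning each pixel to one of the n regions.

```python
import numpy as np

def grid_regions(image, n_side=4):
    """Partition an image into a regular grid of regions.

    A simplified stand-in for SLIC superpixels: real SLIC clusters
    pixels by colour and position, but the output contract is the
    same -- an (H, W) label map with one label per region.
    """
    h, w = image.shape[:2]
    rows = np.minimum(np.arange(h) * n_side // h, n_side - 1)
    cols = np.minimum(np.arange(w) * n_side // w, n_side - 1)
    labels = rows[:, None] * n_side + cols[None, :]
    return labels  # values in {0, ..., n_side**2 - 1}

img = np.zeros((60, 80, 3))
labels = grid_regions(img, n_side=4)
```

In the full model, this label map is all that the weight-learning and combination steps need from the segmentation stage.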

3.2 Feature extraction

Visual features play a significant role in the formation of visual attention. In this model, three visual features are identified from state-of-the-art salient object detection methods and represented as (\({\textbf {f }}_1, {\textbf {f }}_2, {\textbf {f }}_3\)). The feature \({\textbf {f }}_1\) is extracted from the input image using a background measure based on boundary connectivity (BC) [27], which provides highly reliable background knowledge. The contrast cluster (CC) [28] feature is extracted and represented as \({\textbf {f }}_2\); it captures distinctive visual information from the input image. Similarly, \({\textbf {f }}_3\) is extracted as the minimum directional contrast (MDC) [29], which represents contrasts along different spatial directions.
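The MDC idea can be sketched at region level. The version below is our own hypothetical simplification of [29], not the authors' implementation: each region's mean intensity is contrasted against the regions lying to its left, right, above and below, and the minimum is kept, so background regions that blend with the image content on at least one side receive a low score.

```python
import numpy as np

def min_directional_contrast(gray, labels):
    """Per-region minimum directional contrast (simplified MDC sketch).

    A region is salient if it contrasts with the image content in
    *every* direction; background usually touches similar content on
    at least one side, so its minimum directional contrast is low.
    Operates on region means rather than raw pixels for brevity.
    """
    n = labels.max() + 1
    means = np.array([gray[labels == r].mean() for r in range(n)])
    ys, xs = np.indices(labels.shape)
    cy = np.array([ys[labels == r].mean() for r in range(n)])
    cx = np.array([xs[labels == r].mean() for r in range(n)])
    f = np.zeros(n)
    for r in range(n):
        contrasts = []
        # regions strictly to the left / right / above / below of r
        for mask in (cx < cx[r], cx > cx[r], cy < cy[r], cy > cy[r]):
            if mask.any():
                contrasts.append(np.abs(means[mask] - means[r]).mean())
        f[r] = min(contrasts) if contrasts else 0.0
    return f

# toy example: a bright centre region on a dark background
gray = np.zeros((30, 30))
gray[10:20, 10:20] = 1.0
labels = (np.arange(30) // 10)[:, None] * 3 + (np.arange(30) // 10)[None, :]
f = min_directional_contrast(gray, labels)
```

On this toy image the centre region contrasts with its surroundings in all four directions, so its score dominates that of the corner regions.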

3.3 Region-wise weight learning

In the proposed model, the extracted features are combined for salient object detection. Different combination techniques, such as linear and weighted linear, can be applied to the features for computing salient regions. The proposed approach combines the features in a weighted linear manner, which requires an appropriate weight vector. Here, constrained particle swarm optimization (CPSO), a constrained variant of particle swarm optimization (PSO) [26], is used to learn the weights for each image. In [9], CPSO is applied to learn a single weight vector that combines the visual features over the entire image, so region-wise importance is not captured. The proposed model addresses this drawback by learning a weight vector for each meaningful image region, which improves saliency detection performance.

Let \(\varvec{\theta }= \{\varvec{\theta }_{1}, \varvec{\theta }_{2}, ..., \varvec{\theta }_{n}\}\) be the weight vector employed to combine the image features, where \(\varvec{\theta }_{n}=\{\theta _{n_1}, \theta _{n_2}, ..., \theta _{n_N}\}\) is the weight vector for the n-th image region. To obtain the optimal weight vector \(\varvec{\theta }\), CPSO is employed region-wise on the input image with a fitness function similar to [9]. Initially, the visual features of the n-th image region are integrated using the weight vector \(\varvec{\theta }_{n}\) as follows:

$$\begin{aligned} \textbf{S}(\varvec{\theta }_{n})=\sum _{d=1}^{N} \theta _{n_d} \times \mathbf {\textit{f}}_{n_d} \end{aligned}$$
(1)

where \(\mathbf {\textit{f}}_{n_d}\) is the d-th feature of the n-th region, N is the number of features and \(\theta _{n_d}\) is the weight of the d-th feature in the n-th region. The initial saliency map \(\textbf{S}(\varvec{\theta })\) is represented as follows:

$$\begin{aligned} \textbf{S}(\varvec{\theta })=\{\textbf{S}(\varvec{\theta }_{1}), \textbf{S}(\varvec{\theta }_{2}), ..., \textbf{S}(\varvec{\theta }_{n})\} \end{aligned}$$
(2)
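Equations (1)-(2) amount to a per-region weighted sum of the feature maps, which can be sketched directly (the array shapes and function name are our assumption, chosen for the sketch):

```python
import numpy as np

def combine_features(features, labels, theta):
    """Region-wise weighted linear combination of Eqs. (1)-(2).

    features : (N, H, W) stack of feature maps
    labels   : (H, W) region label map, values in {0, ..., n-1}
    theta    : (n, N) matrix holding one weight vector per region
    """
    S = np.zeros(features.shape[1:])
    for r in range(theta.shape[0]):
        mask = labels == r
        # S(theta_n) = sum_d theta_{n_d} * f_{n_d}   (Eq. (1))
        S[mask] = np.tensordot(theta[r], features[:, mask], axes=1)
    return S

# toy example: two constant feature maps, two vertical half regions
features = np.stack([np.ones((4, 6)), 2 * np.ones((4, 6))])
labels = np.zeros((4, 6), dtype=int)
labels[:, 3:] = 1
theta = np.array([[1.0, 0.0],   # left region uses only feature 1
                  [0.0, 1.0]])  # right region uses only feature 2
S = combine_features(features, labels, theta)
```

The toy weights make the left half of the map follow the first feature and the right half the second, which is exactly the region-wise dynamic behaviour the method relies on.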

The initial saliency map \(\textbf{S}(\varvec{\theta })\) is partitioned into two regions, namely the salient region and the non-salient region, by applying an adaptive threshold which is calculated in two steps. First, the Canny edge operator is applied to \(\textbf{S}(\varvec{\theta })\) to obtain the edge map \(\textbf{E}(\varvec{\theta })\) as follows [9]:

$$\begin{aligned} \textbf{E}(\varvec{\theta })={\left\{ \begin{array}{ll} 1 &{} \text { if } pixel \in edge(\textbf{S}(\varvec{\theta })) \\ 0 &{} \textit{otherwise} \end{array}\right. } \end{aligned}$$
(3)

Then, the saliency values at the edge pixels of \(\textbf{E}(\varvec{\theta })\) are collected and their mean is computed as the threshold \(\eta (\varvec{\theta })\) [9]:

$$\begin{aligned} \eta (\varvec{\theta })=\frac{\sum _{p \in P_{I}}\textbf{E}(p,\varvec{\theta }). \textbf{S}(p,\varvec{\theta })}{\sum _{p \in P_{I}}\textbf{E}(p,\varvec{\theta })} \end{aligned}$$
(4)

where \(P_{I}\) denotes the set of image pixels. The binary map \(\textbf{B}(\varvec{\theta })\) is computed by applying the threshold value \(\eta (\varvec{\theta })\) to \(\textbf{S}(\varvec{\theta })\) as follows [9]:

$$\begin{aligned} \textbf{B}(\varvec{\theta })={\left\{ \begin{array}{ll} 1 &{} \text { where } \textbf{S}(\varvec{\theta }) \ge \eta (\varvec{\theta }) \\ 0 &{} \textit{otherwise} \end{array}\right. } \end{aligned}$$
(5)
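Equations (3)-(5) can be sketched as follows. The paper applies the Canny operator to obtain the edge map; to keep the sketch dependency-free we substitute a simple gradient-magnitude mask, so this is a hedged approximation rather than the exact operator:

```python
import numpy as np

def binarize_saliency(S):
    """Adaptive thresholding of Eqs. (3)-(5).

    A gradient-magnitude mask stands in for the Canny edge map E
    (a simplification, not the operator used in the paper).
    """
    gy, gx = np.gradient(S)
    grad = np.hypot(gx, gy)
    E = grad > grad.mean()                       # stand-in edge map  (Eq. (3))
    eta = S[E].mean() if E.any() else S.mean()   # mean saliency on edges (Eq. (4))
    B = (S >= eta).astype(int)                   # binary map (Eq. (5))
    return B, eta

# toy saliency map: a bright square on a dark background
S = np.zeros((20, 20))
S[5:15, 5:15] = 1.0
B, eta = binarize_saliency(S)
```

Because edge pixels straddle the object boundary, the threshold lands between the foreground and background saliency levels, and the binary map recovers the bright square.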

In visual attention, the contribution of salient regions should be maximized and that of non-salient regions minimized. Consequently, the fitness function \(fit (\varvec{\theta })\) is defined as follows [9]:

$$\begin{aligned} fit (\varvec{\theta })= \sum _{p \in P_{sal}}(1-\textbf{S}(p,\varvec{\theta }))+ \sum _{p \in P_{nonsal}}\textbf{S}(p,\varvec{\theta }) \end{aligned}$$
(6)

where \(P_{sal}\) and \(P_{nonsal}\) denote the sets of salient and non-salient pixels, which are obtained by jointly considering \(\textbf{S}(\varvec{\theta })\) and \(\textbf{B}(\varvec{\theta })\). To optimize the weight \(\varvec{\theta }\), \(fit (\varvec{\theta })\) is minimized and the optimized weight is represented as \(\hat{\varvec{\theta }}\).
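The fitness of Eq. (6) and its minimization can be sketched as follows. Note that CPSO is a constrained variant of PSO [26]; the minimal optimizer below is plain PSO over [0, 1]-bounded weights, a stand-in under our own parameter choices rather than the authors' exact algorithm (all names are ours):

```python
import numpy as np

def fitness(S, B):
    """Eq. (6): penalise low saliency inside the binary foreground
    mask B and any residual saliency outside it."""
    sal = B.astype(bool)
    return (1.0 - S[sal]).sum() + S[~sal].sum()

def pso_weights(eval_theta, dim, n_particles=10, iters=30, seed=0):
    """Minimal plain PSO minimising eval_theta over [0, 1]^dim."""
    rng = np.random.default_rng(seed)
    x = rng.random((n_particles, dim))          # particle positions
    v = np.zeros_like(x)                        # velocities
    pbest = x.copy()
    pbest_f = np.array([eval_theta(p) for p in x])
    g = pbest[pbest_f.argmin()].copy()          # global best
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        v = 0.7 * v + 1.5 * r1 * (pbest - x) + 1.5 * r2 * (g - x)
        x = np.clip(x + v, 0.0, 1.0)            # keep weights bounded
        f = np.array([eval_theta(p) for p in x])
        improved = f < pbest_f
        pbest[improved], pbest_f[improved] = x[improved], f[improved]
        g = pbest[pbest_f.argmin()].copy()
    return g

# sanity checks: zero fitness when S matches B; PSO finds a known optimum
S_demo = np.array([[1.0, 0.0], [0.0, 1.0]])
B_demo = (S_demo > 0.5).astype(int)
theta_hat = pso_weights(lambda t: ((t - 0.3) ** 2).sum(), dim=3)
```

In the full model, `eval_theta` would recombine a region's features with the candidate weights (Eq. (1)), rebuild the binary map (Eqs. (3)-(5)) and return Eq. (6); the quadratic here only verifies that the optimizer converges.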

3.4 Region-based feature combination

The weight vector \(\hat{\varvec{\theta }}\) learnt using CPSO is used to combine the extracted visual features \({\textbf {f }}=\{{\textbf {f }}_1\), \({\textbf {f }}_2\), \({\textbf {f }}_3\}\) region-wise to generate the final saliency map \({\textbf {S}}\). For the n-th region, the saliency map \({\textbf {S}}(\hat{\varvec{\theta }}_{n})\) is computed as follows:

$$\begin{aligned} \textbf{S}(\hat{\varvec{\theta }}_{n})=\sum _{d=1}^{N} \hat{\varvec{\theta }}_{n_d} \times \textbf{ f }_{n_d} \end{aligned}$$
(7)

The saliency map \({\textbf {S}}\) is obtained as follows:

$$\begin{aligned} \textbf{S}=\{\textbf{S}(\hat{\varvec{\theta }}_{1}), \textbf{S}(\hat{\varvec{\theta }}_{2}), ..., \textbf{S}(\hat{\varvec{\theta }}_{n})\} \end{aligned}$$
(8)

4 Experimental results

To validate the efficacy of the proposed model, we have conducted extensive experiments on five publicly available datasets, viz. MSRA10K (10,000 images) [30], DUT-OMRON (5,168 images) [31], ECSSD (1,000 images) [32], PASCAL-S (850 images) [33] and SED2 (100 images) [34], against eight saliency detection methods, which include three metaheuristic feature fusion methods, namely C-PSO [9], BBO [10] and SOFT [11], and five traditional methods, namely SR [14], SUN [15], SeR [16], CA [17], and SIM [18]. For the quantitative evaluation, six widely applied evaluation metrics, including Precision, Recall, Receiver Operating Characteristic (ROC), F-measure, Area Under the Curve (AUC) and Mean Absolute Error (MAE), are employed to compare the performance of the proposed model with the state-of-the-art saliency methods.

Fig. 3

Examples of visual comparison of the proposed model with the state-of-the-art methods on five benchmark datasets

4.1 Qualitative comparison

To show the effectiveness of the proposed method against the compared saliency detection methods, a visual comparison is presented in Fig. 3. It is to be noted that all the compared traditional saliency detection methods, namely SR [14], SUN [15], SeR [16], CA [17], and SIM [18], face difficulties in generating uniformly highlighted salient regions on almost all images, as shown in Fig. 3. On the other hand, the proposed model generates high-quality saliency maps not only on images with a simple background or homogeneous objects, such as columns 1, 5 and 7, but also on complex and cluttered background images, such as columns 2, 3, 4 and 5. The reason for this behaviour is the combination of various salient features that complement each other in spatially heterogeneous visual regions.

In addition, Fig. 3 presents a visual comparison of the proposed method with the state-of-the-art feature combination methods [9,10,11] based on metaheuristic optimization algorithms. It can be easily seen that C-PSO [9] captures salient objects with unnecessary background information, which degrades its performance; BBO [10] performs similarly, with a slight improvement on challenging images such as columns 3 and 5. The SOFT [11] method generates good-quality saliency maps but also fails to remove background noise, which affects its saliency detection performance. These visual samples validate that the previous works fail to suppress background noise and to uniformly highlight salient regions, which reduces their performance. However, it can be easily observed from Fig. 3 that the proposed region-wise feature combination method significantly improves saliency detection over its previous counterparts due to its dynamic feature combination strategy. As illustrated in Fig. 3, the proposed method uniformly highlights salient regions and effectively suppresses unnecessary background information in various challenging scenarios. This superior performance confirms its robustness in handling various types of challenging natural images, such as cluttered backgrounds and heterogeneous foregrounds. Overall, the qualitative results validate that the proposed model provides a clear separation between foreground and background even in complex images.

Fig. 4

Precision comparison of the proposed model with the state-of-the-art methods on five benchmark datasets

4.2 Quantitative comparison

For better understanding, we have also conducted quantitative comparisons of the proposed model with the traditional saliency detection methods and its previous counterparts. Figure 4 shows the Precision scores of the proposed method and the compared state-of-the-art saliency detection methods on all five datasets. From Fig. 4, it can be easily observed that the proposed method significantly improves the Precision score over all the compared methods, including C-PSO [9], BBO [10] and SOFT [11], on all datasets. This experimental result confirms that the proposed method achieves higher detection accuracy than all the compared methods. The reasons for this outcome are the effective selection and combination of basic salient features, which validates the proposed dynamic region-wise feature combination approach and the complementary feature selection technique. Figure 5 provides the Recall scores of the proposed method and the compared state-of-the-art saliency detection methods on all five datasets. As illustrated in Fig. 5, the proposed method outperforms all compared saliency detection methods, including its previous counterparts [9,10,11], across all five datasets. This observation validates that the proposed method significantly improves saliency detection in terms of completeness, which refers to completely capturing the salient objects present in a natural image. Such performance supports the incorporation of complementary salient features, which complement each other in visually heterogeneous regions to maximize the identified salient regions.

Fig. 5

Recall comparison of the proposed model with the state-of-the-art methods on five benchmark datasets

Fig. 6

F-measure comparison of the proposed model with the state-of-the-art methods on five benchmark datasets

Further, the superior performance of the proposed method against the compared saliency detection methods in terms of F-measure on all five datasets is shown in Fig. 6. It is worth noting that the proposed method outperforms all compared saliency detection methods, including its previous counterparts [9,10,11], across all five datasets. This result shows that the proposed method effectively balances the relationship between Precision and Recall and further validates its robustness. Next, the performance of the proposed method is examined against the compared saliency detection methods in terms of MAE across all five datasets. As clearly indicated in Fig. 7, the proposed method achieves a better MAE score than the compared saliency detection methods on all five datasets, which justifies the concept of the dynamic region-wise complementary feature combination approach. Subsequently, the performance is analyzed in terms of the AUC score. As illustrated in Fig. 8, the proposed model outperforms the compared traditional saliency detection methods on all five datasets, achieves an AUC score almost similar to C-PSO [9] and BBO [10], and is comparable with SOFT [11].

Fig. 7

MAE comparison of the proposed model with the state-of-the-art methods on five benchmark datasets

Fig. 8

AUC comparison of the proposed model with the state-of-the-art methods on five benchmark datasets

Fig. 9

ROC on the five datasets: (a) MSRA10K [30], (b) DUT-OMRON [31], (c) ECSSD [32], (d) PASCAL-S [33] and (e) SED2 [34]

Figure 9 presents the ROC curves of the proposed model and the compared state-of-the-art methods on all five datasets. It can be observed that the detection accuracy of the proposed model is consistently better than that of the compared traditional saliency detection methods over varying thresholds on all five datasets. It performs better than C-PSO [9] and BBO [10] when the threshold is between 0 and approximately 0.2 on DUT-OMRON [31], ECSSD [32] and PASCAL-S [33], and between 0 and approximately 0.1 on MSRA10K [30]; beyond this, the proposed method is comparable with C-PSO [9] and BBO [10]. On the SED2 [34] dataset, the proposed method outperforms C-PSO [9], BBO [10] and SOFT [11] when the threshold is between 0 and approximately 0.1, after which it is comparable with them. The proposed method is comparable with SOFT [11] on the MSRA10K [30], DUT-OMRON [31], ECSSD [32] and PASCAL-S [33] datasets. In general, the proposed model outperforms the compared traditional saliency detection methods in terms of Precision, Recall, F-measure, MAE, AUC and ROC on all five datasets. This superior performance supports the effectiveness of the regional combination of complementary visual features. Further, the proposed method achieves improvements in Precision, Recall, F-measure and MAE over its previous counterparts [9,10,11], while its performance is comparable with C-PSO [9], BBO [10] and SOFT [11] in terms of AUC and ROC. These experimental results show that the proposed model improves saliency detection performance due to the region-wise combination of complementary features, which helps to uniformly highlight salient regions and effectively suppress background regions.

4.3 Discussion

The region-based feature combination method aims to combine basic complementary salient features to improve saliency detection performance. Its key characteristics are the selection of complementary salient features and their region-wise combination, which enable the proposed method to improve the accuracy and completeness of salient object detection. The experimental results analyzed in Sections 4.1 and 4.2 support the effectiveness of the proposed method. In Section 4.1, the variety of visual examples confirms the effective performance of the proposed method in challenging scenarios. As presented in Section 4.2, the Precision of the proposed method is higher than that of the compared methods, which confirms that it accurately finds the salient regions in the image. Similarly, the proposed method achieves a significant improvement in Recall over all compared saliency detection methods, which substantiates its completeness. Further, the proposed method outperforms the compared methods in terms of F-measure, demonstrating a good balance between Precision and Recall; this supports the robustness of the proposed method.

4.4 Computation time analysis

All methods are evaluated in the following configuration: Intel(R) Core(TM) i7-4770 CPU @ 3.40 GHz with 8 GB RAM; the computation time is measured on MSRA10K [30], as shown in Fig. 10. The proposed method is implemented in MATLAB. It can be observed from Fig. 10 that the proposed method requires less computational time than SOFT [11] and CA [17], while it spends almost the same time as C-PSO [9] and BBO [10].

Fig. 10

Running time (sec.) comparison of the proposed model with the state-of-the-art methods on the MSRA10K [30] dataset

5 Conclusion and future work

In this paper, a novel visual feature combination framework has been proposed for salient object detection. This fusion approach exploits region-wise visual saliency characteristics to effectively combine various features extracted from natural images. The framework utilizes the CPSO algorithm to find region-wise weight vectors and combines the features region-wise according to the learnt weights, generating a robust saliency map. Extensive experiments have been conducted to compare the proposed framework with eight state-of-the-art saliency detection methods on five publicly available saliency benchmark datasets. The experimental results show that the proposed framework outperforms or is comparable with the state-of-the-art saliency detection methods in terms of several performance metrics. In future work, we will focus on finding effective salient features and a combination algorithm that improves saliency prediction performance while remaining computationally efficient.