1 Introduction

Visual image quality is degraded by distortions introduced during data acquisition, compression, storage, and transmission. Because subjective quality assessment is complex and time-consuming, it is important to develop reliable and accurate objective methods.

Full-reference image quality assessment (FR-IQA) methods achieve good performance in terms of high correlation with subjective quality scores. SSIM [24], MS-SSIM [23], CW-SSIM [19], FSIM [28], VGS [31], VIF [20] and VSNR [1] are well-known examples. These methods need the reference image to predict quality. In many applications the reference image is not available, so no-reference methods are needed.

In no-reference image quality assessment (NR-IQA) methods, quality is evaluated from the distorted image alone, with no additional information. The purpose of NR-IQA methods is to produce quality estimates that are highly correlated with subjective scores. NR-IQA methods can be used in applications such as quality of service (QoS) monitoring. When visual data is transmitted, lossy compression, packet loss, noise and other distortions degrade the quality of the delivered content. Since the reference image is not available at the receiver, NR-IQA methods are required.

NR-IQA methods can be divided into two categories: distortion-specific and general-purpose methods. In distortion-specific methods, quality assessment is performed for a particular type of distortion. For example, in [25] a no-reference metric is proposed that estimates the blocking artifacts in DCT-coded images. In [3], blur assessment is performed by classifiers based on neural network architectures, which take low-level blur metrics and image features as input. A perceptual sharpness/blurriness metric based on the concept of just noticeable blur (JNB) is introduced in [6]. A blur metric that performs edge detection and uses a probabilistic model to estimate the probability of detecting blur at the edges is presented in [15]; the metric is obtained by computing the cumulative probability.

Quality evaluation of JPEG and JPEG2000 images is another class of distortion-specific methods. In [10], the quality of JPEG2000-compressed images is estimated from the statistics of the gradient profiles along edges. In [27], an activity map based on monotone-changing pixels and zero-crossing pixels is created within non-overlapping image blocks, and the map is then pooled into a single quality score. A kurtosis-based image quality metric in the DCT domain for JPEG2000 images is developed in [29]. In [8], the quality of JPEG images is estimated by counting the zero-valued DCT coefficients within each block and weighting the counts by a quality relevance map to form the quality degradation metric.

Most general-purpose methods follow one of two approaches: natural scene statistics or learning. Natural-scene-statistics methods rely on the fact that natural images have certain statistical properties that change with distortion. DIIVINE [14] and BIQI [13] use statistical characteristics of wavelet coefficients to predict quality. In BLIINDS [17] and BLIINDS-II [18], quality-aware features are extracted from the statistical properties of DCT coefficients. In BRISQUE [12], no transform is used, and the distribution of locally normalized luminance coefficients is used to design quality features. GLBP [30] uses local binary pattern (LBP) [16] statistics in Laplacian of Gaussian (LOG) space to estimate quality. In contrast, learning-based methods obtain the quality features through a machine learning process. Among learning-based methods, LBIQ [21], GRNN [9] and CBIQ [26] have been shown to be successful. In [26], a codebook is first created and used to encode Gabor features; average pooling is then employed to form the final feature vector. Although some NR-IQA methods produce results consistent with human judgments of quality, further work is needed to obtain accurate methods with low complexity that can be used in real-time IQA applications.

In this paper, a new NR-IQA method is presented. The wavelet subbands are encoded individually with the LBP operator, and their histograms are concatenated and used as quality features. No codebook is required; the quality features are extracted directly from the subbands of the wavelet transform by the LBP operator. When the algorithm is tested on the LIVE and TID2008 databases, it achieves a good correlation with the subjective scores that is competitive with state-of-the-art methods. The method also has low complexity and is suitable for real-time NR-IQA applications.

The remainder of this paper is organized as follows. The local binary pattern is briefly introduced in Section 2. The proposed method is explained in Section 3. The experimental results are presented in Section 4. Finally, Section 5 concludes the paper.

2 Local binary pattern

The LBP operator is used to describe the texture of an image [16]. It is one of the most widely used local texture descriptors because of its robustness to illumination variation, low computational complexity, and ability to encode fine details, and it is used in many image processing applications. Many variants of the LBP operator have been proposed. The LBP code of each pixel is generated by comparing the neighboring pixels with the center pixel.

$$ {LBP}_{P, R}=\sum_{p=0}^{P-1} s\left({g}_p-{g}_c\right){2}^p,\mathrm{s}\left(\mathrm{x}\right)=\left\{\begin{array}{c}\hfill 1,\kern0.75em x\ge 0\hfill \\ {}\hfill 0,\kern0.75em x<0\hfill \end{array}\right. $$
(1)

In this formula, \( g_p \) denotes the neighboring pixels and \( g_c \) the central pixel. P is the number of neighboring pixels and R is the radius of the neighborhood. If the gray level of a neighboring pixel is greater than or equal to that of the center pixel, it is encoded as 1; otherwise it is encoded as 0. For each pixel, the decimal value of the resulting binary code is computed and called the LBP code of the pixel. The histogram [5] of the LBP codes is then computed and used as a feature vector. Fig. 1 illustrates the calculation of the LBP code. An LBP code is called uniform if, when the binary code is considered circular, it contains at most two bitwise transitions from 1 to 0 or vice versa. These patterns contain more information than the others. In their experiments with texture images, Ojala et al. observed that about 90% of the patterns obtained with \( {LBP}_{8,1} \) are uniform [16]. If all uniform patterns are kept and a single code is assigned to all non-uniform patterns, the uniform LBP code is obtained. In this case, fewer LBP codes are required and the histogram vector becomes smaller. The number of bitwise transitions U, which determines whether a pattern is uniform, is calculated as follows.

$$ U\left({LBP}_{P, R}\right)=\left| s\left({g}_{P-1}-{g}_c\right)- s\left({g}_0-{g}_c\right)\right|+\sum_{p=1}^{P-1}\left| s\left({g}_p-{g}_c\right)- s\left({g}_{p-1}-{g}_c\right)\right| $$
(2)
Fig. 1 An example of calculating the LBP code for a pixel
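As an illustration of Eqs. (1) and (2), the following Python sketch computes the basic LBP code and the transition count U for a single pixel. The neighbor values, the neighbor ordering (which neighbor is \( g_0 \) and the direction of traversal) and the function names are assumptions made for this example; they are not taken from Fig. 1.

```python
def s(x):
    """Step function of Eq. (1): 1 if x >= 0, else 0."""
    return 1 if x >= 0 else 0


def lbp_code(neighbors, center):
    """Eq. (1): thresholded differences weighted by powers of two."""
    return sum(s(g - center) << p for p, g in enumerate(neighbors))


def uniformity(neighbors, center):
    """Eq. (2): number of 0/1 transitions in the circular bit pattern."""
    bits = [s(g - center) for g in neighbors]
    # bits[p - 1] with p = 0 wraps around to bits[P - 1], which gives the
    # first term of Eq. (2); the remaining terms are the sum over p >= 1.
    return sum(abs(bits[p] - bits[p - 1]) for p in range(len(bits)))


# Hypothetical 8-neighborhood (g_0 ... g_7) around a center value of 6.
neighbors = [6, 5, 2, 1, 7, 8, 9, 7]
code = lbp_code(neighbors, 6)    # bits 1,0,0,0,1,1,1,1 -> 241
u = uniformity(neighbors, 6)     # two transitions -> uniform pattern
```

With these values the bit pattern has exactly two transitions, so the pattern counts as uniform.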

To obtain the rotation-invariant code, the LBP code is first computed and then all of its circular right shifts are calculated. The minimum decimal value among these P codes is taken as the rotation-invariant LBP code. In this case, fewer binary codes are needed to describe the texture. For example, for P = 8 only 36 rotation-invariant binary codes are required.

$$ {LBP}_{P, R}^{ri}= \min \left\{ ROR\left({LBP}_{P, R}, i\right)\ | i=0,1,2,\dots, P-1\right\} $$
(3)

\( ROR\left({LBP}_{P, R}, i\right) \) performs a circular bitwise right shift of the LBP code by i positions. The code with the minimum decimal value is selected as the rotation-invariant LBP code. The uniform and rotation-invariant LBP code is computed as follows.

$$ {LBP}_{P, R}^{riu2}=\left\{\begin{array}{c}\hfill \sum_{p=0}^{P-1} s\left({g}_p-{g}_c\right)\kern1em if\ U\left({LBP}_{P, R}\right)\le 2\hfill \\ {}\hfill P+1\kern6.25em otherwise\kern3.25em \hfill \end{array}\right. $$
(4)

The superscript riu2 denotes rotation-invariant uniform patterns with a U value of at most two. Finally, the texture feature is formed by calculating the occurrence histogram of the codes.

$$ H(k)=\sum_{m=1}^M\sum_{n=1}^N f\left( LBP\left( m, n\right), k\right)/ MN, k\in \left[0, K\right], f\left( x, y\right)=\left\{\begin{array}{c}\hfill 1,\kern1em x= y\kern4.25em \hfill \\ {}\hfill 0,\kern1em otherwise\kern2em \hfill \end{array}\right. $$
(5)
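Eqs. (3) to (5) can be sketched in a few lines of Python. The sketch works directly on integer LBP codes; the function names are hypothetical, and the image is assumed to have been flattened into a list of codes before the histogram is computed.

```python
def ror(code, i, P):
    """Circular right shift of a P-bit code by i positions."""
    i %= P
    return ((code >> i) | (code << (P - i))) & ((1 << P) - 1)


def lbp_ri(code, P):
    """Eq. (3): minimum over all P circular shifts."""
    return min(ror(code, i, P) for i in range(P))


def transitions(code, P):
    """U value: number of 0/1 transitions in the circular P-bit pattern."""
    bits = [(code >> p) & 1 for p in range(P)]
    return sum(abs(bits[p] - bits[p - 1]) for p in range(P))


def lbp_riu2(code, P):
    """Eq. (4): number of set bits if uniform, P + 1 otherwise."""
    return bin(code).count("1") if transitions(code, P) <= 2 else P + 1


def histogram(codes, K):
    """Eq. (5): normalized occurrence histogram over k = 0..K."""
    total = len(codes)
    return [sum(1 for c in codes if c == k) / total for k in range(K + 1)]


# Number of distinct rotation-invariant codes for P = 8 and P = 4.
n_ri_8 = len({lbp_ri(c, 8) for c in range(256)})   # 36
n_ri_4 = len({lbp_ri(c, 4) for c in range(16)})    # 6
```

Enumerating all 256 codes for P = 8 reproduces the 36 rotation-invariant codes mentioned above; for P = 4 the same enumeration gives 6.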

The LBP code introduced above is the basic variant of the operator; various types of LBP have been proposed for different applications. In this paper, the LBP operator is applied to wavelet transform coefficients. The LBP operator is usually applied to image patches and is rarely used for wavelet coefficients. Because wavelet coefficients differ from natural images, describing their texture requires a dedicated type of LBP. To encode the structure of the wavelet coefficients, the magnitude of the center coefficient of each 3×3 block is also taken into account in the LBP code, which describes the structure of the coefficients better. For this purpose, the center coefficient is compared with a threshold, encoded as one bit, and concatenated to the \( {LBP}_{P, R}^{ri} \) code. The threshold is set to the average value of the coefficients in the subband, \( c_{sub} \).

$$ LBP\_{C}_{P, R}= s\left({g}_c-{c}_{sub}\right) $$
(6)

Another point that should be considered is choosing the threshold in the computation of LBP code. In the basic LBP, the comparison threshold is 0, but for precise characterization of local structure it is better to choose a different value.

$$ {LBP}_{P, R, T}=\sum_{p=0}^{P-1} s\left({g}_p-{g}_c\right){2}^p,\mathrm{s}\left(\mathrm{x}\right)=\left\{\begin{array}{c}\hfill 1,\kern0.75em x\ge T\hfill \\ {}\hfill 0,\kern0.75em x< T\hfill \end{array}\right. $$
(7)

The resulting operator is called the Wavelet Local Binary Pattern (WLBP). Fig. 2 shows the WLBP codes of the LL1 wavelet subband.

Fig. 2 A visual example of WLBP codes of the LL1 subband for a LIVE database image
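The construction of the WLBP code from Eqs. (3), (6) and (7) can be sketched as follows for P = 4 and R = 1. Several details are assumptions made for this sketch because the text does not specify them: border coefficients are skipped, the center bit is placed as the most significant bit of the code, and the twelve possible codes are mapped to histogram bins in ascending order. The function names are hypothetical.

```python
# The six rotation-invariant 4-bit codes, and the 12 WLBP codes obtained by
# prepending the center bit (assumed here to be the most significant bit).
RI_4 = [0, 1, 3, 5, 7, 15]
WLBP_BINS = [(c << 4) | r for c in (0, 1) for r in RI_4]


def wlbp_codes(subband, T):
    """WLBP(4,1) codes of the interior coefficients of a 2-D subband."""
    rows, cols = len(subband), len(subband[0])
    c_sub = sum(map(sum, subband)) / (rows * cols)   # subband mean, Eq. (6)
    codes = []
    for m in range(1, rows - 1):
        for n in range(1, cols - 1):
            gc = subband[m][n]
            neighbors = [subband[m][n + 1], subband[m - 1][n],
                         subband[m][n - 1], subband[m + 1][n]]
            # Eq. (7): a neighbor is coded 1 if it exceeds the center by >= T.
            code = sum((1 if g - gc >= T else 0) << p
                       for p, g in enumerate(neighbors))
            # Eq. (3): minimum over the four circular shifts.
            ri = min(((code >> i) | (code << (4 - i))) & 0xF for i in range(4))
            # Eq. (6): compare the center coefficient with the subband mean.
            center_bit = 1 if gc - c_sub >= 0 else 0
            codes.append((center_bit << 4) | ri)
    return codes


def wlbp_histogram(subband, T):
    """Normalized 12-bin histogram of the WLBP codes of one subband."""
    codes = wlbp_codes(subband, T)
    return [codes.count(b) / len(codes) for b in WLBP_BINS]
```

Under these assumptions each subband yields a 12-bin histogram: six rotation-invariant patterns for each of the two values of the center bit.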

3 The proposed method

The proposed method for blind image quality assessment consists of the following steps. Figure 3 illustrates the block diagram of the proposed method.

Fig. 3 Block diagram of the proposed method

3.1 Wavelet decomposition

The wavelet transform is one of the most important tools for signal representation. It has been used in image processing, data compression, and signal processing. Some applications require frequency and spatial information at the same time and at different resolutions, and the wavelet transform provides both. Applying one level of the two-dimensional discrete wavelet transform (2D-DWT) decomposes the image into four subbands. The decomposition is performed by passing the image through two complementary filters, a low-pass (approximation) filter and a high-pass (detail) filter. The high-pass filter extracts the high-frequency part, and the low-pass filter gives the low-frequency information, which carries most of the energy of an image. The resulting subbands have different frequency characteristics. The approximation subband (LL) contains the low-frequency information of the image. The three remaining subbands are called detail subbands. The LH and HL subbands represent the horizontal and vertical edges of the image. The HH subband contains high-frequency information and represents the diagonal edges of the image.

Figure 4 shows the decomposition of an image by one level of the wavelet transform. In this paper, feature extraction is performed in the wavelet domain. The choice of the wavelet transform was motivated by the orientation and spatial frequency selectivity of the human visual system (HVS). Studies of the functional properties of the primary visual cortex indicate the existence of neurons that are sensitive to orientation and spatial frequency [11] [4] [7]. These multi-channel properties of the HVS can be modeled by the wavelet transform. Distortions change the statistical properties of the wavelet coefficients, so quality-aware features for predicting image quality can be extracted by estimating these properties. In this paper, we extract the structural information of the coefficients with the LBP operator instead of modeling them with a generalized Gaussian distribution (GGD).

Fig. 4 Single-level wavelet decomposition applied to a LIVE database image

In the proposed method, one level of a biorthogonal wavelet transform is applied to the grayscale image, which decomposes the image into four subbands. Since every subband contains different frequency and orientation information, the LBP operator is applied to each subband separately.
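To make the four-subband structure concrete, the sketch below performs one level of a 2D DWT in plain Python. It uses the Haar filter as a simple stand-in, not the biorthogonal filter of the proposed method, and the function name is hypothetical. The LH/HL naming follows the description above (LH responds to horizontal edges, HL to vertical edges); other texts swap these two labels.

```python
def haar_dwt2(image):
    """One level of a 2-D orthonormal Haar DWT (stand-in for biorthogonal).

    `image` is a list of rows of numbers. A trailing odd row or column is
    ignored. Returns the subbands (LL, LH, HL, HH) as lists of rows.
    """
    rows, cols = len(image), len(image[0])
    LL, LH, HL, HH = [], [], [], []
    for m in range(0, rows - 1, 2):
        ll, lh, hl, hh = [], [], [], []
        for n in range(0, cols - 1, 2):
            a, b = image[m][n], image[m][n + 1]
            c, d = image[m + 1][n], image[m + 1][n + 1]
            ll.append((a + b + c + d) / 2)   # low-pass in both directions
            lh.append((a + b - c - d) / 2)   # difference between rows
            hl.append((a - b + c - d) / 2)   # difference between columns
            hh.append((a - b - c + d) / 2)   # diagonal difference
        LL.append(ll)
        LH.append(lh)
        HL.append(hl)
        HH.append(hh)
    return LL, LH, HL, HH


LL, LH, HL, HH = haar_dwt2([[1, 2], [3, 4]])
# LL == [[5.0]], LH == [[-2.0]], HL == [[-1.0]], HH == [[0.0]]
```

Each subband has half the width and height of the input, and for this orthonormal filter the sum of squared coefficients over the four subbands equals that of the input block (30 in the example).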

3.2 LBP encoding of subband coefficients

Introducing distortions to an image leads to changes in its local structure. The LBP operator can encode this local structure, and the occurrence histogram of uniform local binary pattern codes is a very strong texture feature. Since the wavelet transform can be considered a model of the HVS, employing LBP in the wavelet domain is more discriminative for quality prediction.

In the proposed method, \( {LBP}_{4,1} \) is used. For 4 neighboring pixels, \( {LBP}_{4,1}^{ri} \) and \( {LBP}_{4,1}^{riu2} \) are the same. However, if more neighboring pixels are needed, it is better to use non-uniform LBP. Distorted images contain more non-uniform patterns than natural images, and these patterns should not be neglected. Uniform LBP assigns a single code to all non-uniform patterns, but an accurate description of the structure of the wavelet coefficients requires all patterns, including the non-uniform ones. For example, under white noise distortion the patterns are not uniform.

In this paper, T = 1 is selected for the detail subbands and T = 4 for the approximation subband. In the next step, the normalized histogram of the codes is calculated for each subband. The feature vector is formed by concatenating these normalized histograms.

$$ {J}_{WLBP}(I)=\left({H}_{WLBP_{P, R,{T}_1}}^{sub_1},\dots, {H}_{WLBP_{P, R,{T}_N}}^{sub_N}\right) $$
(8)

To demonstrate the effect of the choice of subbands on quality prediction, the proposed method is implemented in three ways. Encoding one subband with \( {WLBP}_{4,1} \) yields 12 features. In the first variant (WLBP-I), the histograms of the LBP codes of the HL1 and HH1 subbands are concatenated to form the feature vector, giving 24 quality features. In the second variant (WLBP-II), the HL1, LH1 and HH1 subbands are used, and in the third variant (WLBP-III) all subbands of the single-level wavelet decomposition are used. Figure 5 shows the sailing image of the LIVE database and its distorted versions, with the associated quality-aware features of the first variant.

Fig. 5 Sailing2 image of the LIVE database and its distorted versions, with the normalized histograms of the associated features. The features are formed by concatenating the histograms of the LBP codes corresponding to the HL and HH subbands: (a) reference image; (b)-(f) Gaussian blur, white noise, JPEG2000, JPEG and fast fading distortions, respectively
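The concatenation of Eq. (8) for the three variants can be sketched as follows. The dictionary layout, the function name and the order in which the subband histograms are concatenated are assumptions made for this example; the text only specifies which subbands each variant uses and that each subband contributes 12 features.

```python
# Subbands used by each variant, as described in Section 3.2.
SUBBAND_SETS = {
    "WLBP-I": ("HL", "HH"),
    "WLBP-II": ("HL", "LH", "HH"),
    "WLBP-III": ("LL", "LH", "HL", "HH"),
}


def feature_vector(histograms, variant):
    """Eq. (8): concatenate the per-subband normalized histograms.

    `histograms` maps a subband name to its 12-bin normalized histogram.
    """
    features = []
    for name in SUBBAND_SETS[variant]:
        features.extend(histograms[name])
    return features


# Placeholder histograms (uniform over 12 bins), only to show the lengths.
dummy = {name: [1 / 12] * 12 for name in ("LL", "LH", "HL", "HH")}
lengths = {v: len(feature_vector(dummy, v)) for v in SUBBAND_SETS}
# lengths == {"WLBP-I": 24, "WLBP-II": 36, "WLBP-III": 48}
```

With 12 features per subband, the three variants give feature vectors of length 24, 36 and 48, respectively.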

3.3 Quality prediction with support vector regression

Support vector regression is a method that estimates a function to map input patterns to their target values. Suppose that the training set is represented as follows:

$$ \left\{\left({x}_1, {y}_1\right),\left({x}_2, {y}_2\right),\dots, \left({x}_n, {y}_n\right)\right\}\subset \mathcal{X}\times \mathrm{\mathbb{R}} $$
(9)

where \( x_i \) is a feature vector in \( \mathcal{X} \) and \( y_i \) is the real number associated with \( x_i \). In ε-SVR, the estimated function should deviate from the target values by at most ε for all training data. SVR is usually employed for high-dimensional regression problems [22]. In this paper, ε-SVR is first trained on the training set to map the quality-aware features to their quality scores. Once the function is estimated, the model can predict the quality of images.

$$ {Q}_{predict}= SVR\left({J}_{WLBP}, function\right) $$
(10)

In this formula, \( Q_{predict} \) is the objective quality score.
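For reference, the ε-deviation requirement corresponds to the standard soft-margin ε-SVR problem, written here in the usual textbook form. The symbols \( w \), \( b \), \( \phi \), \( C \), \( \xi_i \), \( \xi_i^{*} \) and \( \gamma \) are generic notation that is not used elsewhere in this paper; \( C \) is the cost parameter and \( \gamma \) the width parameter of the RBF kernel.

$$ \min_{w, b, \xi, {\xi}^{*}}\ \frac{1}{2}{\left\Vert w\right\Vert}^2+ C\sum_{i=1}^n\left({\xi}_i+{\xi}_i^{*}\right) $$

$$ \mathrm{subject\ to}\quad {y}_i-\left\langle w,\phi \left({x}_i\right)\right\rangle - b\le \varepsilon +{\xi}_i,\quad \left\langle w,\phi \left({x}_i\right)\right\rangle + b-{y}_i\le \varepsilon +{\xi}_i^{*},\quad {\xi}_i,{\xi}_i^{*}\ge 0 $$

$$ k\left( x,{x}^{\prime}\right)=\left\langle \phi (x),\phi \left({x}^{\prime}\right)\right\rangle =\exp \left(-\gamma {\left\Vert x-{x}^{\prime}\right\Vert}^2\right) $$

The slack variables \( \xi_i \) and \( \xi_i^{*} \) allow individual training samples to deviate by more than ε at a cost controlled by \( C \).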

4 Experimental results and discussion

The LIVE and TID2008 databases are used to test the performance of the algorithm. The LIVE database contains 29 reference images and 779 distorted images. Five types of distortion have been applied to the images: JPEG compression, JPEG2000 compression, additive white Gaussian noise, Gaussian blur, and Rayleigh fast-fading channel distortion (Fast Fading). Every distorted image in the database is provided with a number between 0 and 100, called the differential mean opinion score (DMOS), which represents the subjective image quality. The closer the DMOS is to zero, the higher the image quality.

The TID2008 database consists of 25 reference images and 1700 distorted images. Seventeen types of distortion at 4 levels have been applied to the images. In TID2008, subjective quality is given by the mean opinion score (MOS). We tested the proposed method only on the JPEG, JPEG2000 compression (JP2K), additive white noise (WN) and Gaussian blur (blur) distortions of TID2008. The Fast Fading distortion does not exist in the TID2008 database. The artificial image of TID2008 and its distorted versions are also included in the simulations.

Each database is split into two sets, 20% of the images for testing and 80% for SVR training, so that the two sets do not overlap. In addition, each of the test and training sets includes reference images and their associated distorted images. SVR is implemented with the LIBSVM package [2], using ε-SVR with an RBF kernel. The cost and gamma parameters are determined by a search over a two-dimensional logarithmic grid. The algorithm has been run 1000 times for every distortion, with the test and training sets chosen randomly in each run.

The proposed method is compared with well-known NR methods using Spearman's rank-order correlation coefficient (SROCC) and Pearson's linear correlation coefficient (LCC). These metrics measure the correlation between two sets of variables; higher SROCC and LCC values mean that the results are more highly correlated with the subjective evaluations. The median SROCC and LCC across the 1000 tests on the LIVE database are shown in Tables 1 and 2. To demonstrate the accuracy of the algorithm, the standard deviations of the SROCC and LCC values on the LIVE database are listed in Table 3. The median SROCC across the 1000 tests on the TID2008 database is given in Table 4. The two best NR-IQA results are highlighted in bold.
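One way to obtain non-overlapping training and test sets is to split at the level of reference images, so that all distorted versions of a reference fall into the same set. The sketch below illustrates this reading of the protocol; it is an assumption, as are the function name and the data layout (one reference identifier per distorted image). Splitting by reference only approximates the 80/20 ratio in terms of image counts.

```python
import random


def split_by_reference(ref_ids, test_fraction=0.2, seed=None):
    """Randomly split images into train/test sets by reference image.

    `ref_ids[i]` is the identifier of the reference image from which
    distorted image i was generated. Returns (train_idx, test_idx).
    """
    refs = sorted(set(ref_ids))
    rng = random.Random(seed)
    rng.shuffle(refs)
    n_test = max(1, round(test_fraction * len(refs)))
    test_refs = set(refs[:n_test])
    train_idx = [i for i, r in enumerate(ref_ids) if r not in test_refs]
    test_idx = [i for i, r in enumerate(ref_ids) if r in test_refs]
    return train_idx, test_idx


# Hypothetical layout: 779 distorted images derived from 29 references.
ref_ids = [i % 29 for i in range(779)]
for run in range(3):                     # the paper uses 1000 random runs
    train_idx, test_idx = split_by_reference(ref_ids, seed=run)
```

Repeating the split with a different seed in each run gives the random train-test combinations over which the median correlations are reported.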

Table 1 Median SROCC comparison across 1000 train-test combinations on the LIVE database
Table 2 Median LCC comparison across 1000 train-test combinations on the LIVE database
Table 3 Standard deviation of SROCC and LCC values across 1000 tests on the LIVE database
Table 4 Median SROCC comparison across 1000 train-test combinations on the TID2008 database
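The two evaluation metrics can be computed as in the following plain-Python sketch. LCC is the Pearson correlation between predicted and subjective scores, and SROCC is the Pearson correlation between their ranks, with tied values sharing their average rank. The function names and the sample scores are hypothetical and are not results from the tables.

```python
from math import sqrt
from statistics import median


def pearson(x, y):
    """Pearson linear correlation coefficient (LCC)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def ranks(values):
    """1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r


def spearman(x, y):
    """Spearman rank-order correlation coefficient (SROCC)."""
    return pearson(ranks(x), ranks(y))


# Hypothetical predicted and subjective scores for three random splits.
runs = [([1, 2, 3, 4], [10, 20, 35, 30]),
        ([1, 2, 3, 4], [12, 18, 25, 40]),
        ([1, 2, 3, 4], [15, 11, 30, 45])]
median_srocc = median(spearman(pred, dmos) for pred, dmos in runs)
```

The median over runs is taken exactly as in the tables, only over three illustrative runs instead of 1000.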

As the results show, the proposed method is competitive with the compared NR-IQA algorithms. In the third variant (WLBP-III), encoding the LL1 subband improves the JP2K results on both the LIVE and TID2008 databases, which means that the low-frequency information of the LL1 subband is useful for estimating the quality of JP2K-distorted images. The blur results on the TID2008 database are also improved by the third variant. Compared with NR-GLBP, in which the features are extracted by the LBP operator in LOG space, the proposed method uses far fewer features to predict image quality, and unlike the DIIVINE and BIQI methods, it uses a simple 2D biorthogonal wavelet. For these reasons, the proposed method has low complexity and is suitable for real-time NR-IQA applications.

5 Conclusion

In this paper, a novel method for no-reference image quality assessment has been introduced. Quality-aware features are extracted from the subbands of the wavelet coefficients by the LBP operator. The proposed algorithm has been tested on the LIVE and TID2008 databases, and the results are competitive with well-known NR-IQA methods in terms of correlation with subjective scores. With the first proposed approach, a 24-dimensional feature vector achieves results comparable to most state-of-the-art NR-IQA algorithms. Because it uses \( {LBP}_{4,1} \), which has a low computational load, and a simple 2D wavelet, the method has low computational complexity and can be used in real-time image quality assessment applications.