Abstract
The reception of multimedia applications often depends on the quality of the processed and displayed visual content. This is the main reason for the development of automatic image quality assessment (IQA) techniques, which try to mimic properties of the human visual system and produce objective scores for evaluated images. Most of them require a training step in which subjective scores, obtained in tests with human subjects, are used for parameter tuning. In this paper, it is shown that pairwise score differences (PSD) can be successfully used for training a full-reference hybrid IQA measure based on least absolute shrinkage and selection operator (lasso) regression. The results of an extensive experimental evaluation on the four largest IQA benchmarks show that the proposed IQA technique is statistically better than its version trained using raw scores, and both approaches are statistically better than state-of-the-art full-reference IQA measures. They are also better than other hybrid approaches. In the paper, the evaluation protocol is extended with tests using PSD.
1 Introduction
With the rapid development of different image processing methods, considerable attention has been given to techniques that try to mimic human visual perception [9, 46]. Image quality assessment (IQA) techniques measure the quality of presented images, and often support, or compare, image enhancement, restoration, or denoising techniques [3, 5, 29, 32]. According to the availability of a reference image, IQA measures are divided into full-reference, reduced-reference, and no-reference techniques [4, 19, 31, 47].
This paper focuses on full-reference IQA measures. Over the last decade, many different full-reference IQA approaches have been introduced [4], starting from the simple peak signal-to-noise ratio (PSNR), or the noise quality measure (NQM) [8], in which a linear frequency distortion and an additive noise injection are used for modelling the distorted image. The popular structural similarity (SSIM) measure [42], an extension of the universal image quality index (UQI) [39], uses loss of correlation, luminance distortion, and contrast distortion. SSIM was further extended using a multi-scale approach (MSSIM) [41] or statistical models of natural scenes, as in the information content weighted SSIM (IWSSIM) [40]. Such statistical models are also utilised in the information fidelity criterion (IFC) [33] and visual information fidelity (VIF) [34]. The feature similarity index (FSIM) [51] and its version for colour images (FSIMc) employ phase congruency and image gradient magnitude. In the Riesz-transform based feature similarity measure (RFSIM) [50], Riesz-transform features at key locations are compared between the distorted image and its reference image. SURF-SIM [38], in turn, uses Speeded-Up Robust Features (SURF) in order to detect multiscale differences between features. The spectral residual based similarity measure (SRSIM) [49] and the visual saliency-induced index (VSI) [52] use visual saliency maps. Contrast changes and luminance distortions are used in the gradient similarity (GSM) measure [16], and inter-patch and intra-patch similarities were modelled in [54] using a modified normalised correlation coefficient and image curvature. The edge based image quality assessment (EBIQA) measure is based on different edge features extracted from the distorted image and its pristine equivalent [1]. In [11], a novel pooling strategy based on the harmonic mean was proposed.
In the literature, there are also approaches in which several IQA techniques are combined into a hybrid, or fusion, measure. For example, in the most apparent distortion algorithm (MAD) [14], local luminance and contrast masking evaluate high-quality images, while changes in the local statistics of spatial-frequency components are used for images of low quality. Information obtained using saliency maps, gradient, and contrast information was fused in [30]. In [21, 22], the scores of MSSIM, VIF, and R-SVD were non-linearly combined. A preliminary work with a non-linear combination of several IQA measures selected by a genetic algorithm was shown in [23]. In [17], SNR, SSIM, VIF, and VSNR were combined using canonical correlation analysis, and a regularised regression was used to combine up to seven IQA models in [13]. In [25], a support vector machine classifier used for predicting the distortion type, followed by a fusion of SSIM, VSNR, and VIF using k-nearest-neighbour regression, was proposed. An adaptive combination of IQA approaches with an edge-quality term based on preservation of edge direction was introduced in [26]. In [18], a fusion measure using a support vector regression approach was proposed. Lukin et al. [20] introduced a fusion measure which combines six IQA measures using a neural network. In [48], kernel ridge regression was used to combine detected perceptually meaningful structures and local distortion measurements. In other approaches, adaptive weighting [2] or an internal generative mechanism [43] were considered in order to obtain hybrid measures.
For the evaluation of IQA approaches, specific IQA benchmark databases have been introduced [14, 27, 28, 42]. They contain pristine images, their distorted equivalents, and subjective human evaluations in the form of mean opinion scores (MOS) or differential MOS (DMOS). Some images from these benchmarks, with subjective scores, are often used for tuning the parameters of many developed methods, e.g., [13, 20, 44, 52, 54]. Here, the number of used images should be small in order to obtain a benchmark-independent solution. In this paper, a novel full-reference hybrid IQA measure is proposed which employs regularised least-squares regression using the least absolute shrinkage and selection operator (lasso) [36, 37]. This technique combines objective scores produced by up to 16 full-reference IQA measures. The lasso regression was applied since it performs selection of the most important predictors, which makes the usage of such a combined measure more practical: finally, only a few IQA measures take part in the fusion. In the proposed approach, the regression coefficients are determined using a subset of images and their scores from benchmark databases. It is shown that the proposed hybrid measure is significantly better if pairwise score differences (PSD) are used instead of raw scores. These differences can also be used for the performance evaluation of IQA measures. The application of PSD can be motivated by the organisation of some IQA tests with human subjects [28], where the observer compares distorted images with each other, taking into account the pristine image. It can be assumed that PSD can also be used in the development of other IQA measures that require supervised learning. The hybrid measures developed using raw scores or PSD in the lasso regression are compared with state-of-the-art techniques on the four largest IQA benchmark image datasets using a well-established evaluation protocol, as well as statistical significance tests.
The rest of this paper is organised as follows. Section 2 presents the proposed hybrid IQA measure. In Section 3, the approach is compared with state-of-the-art measures using four IQA benchmarks. Finally, concluding remarks are presented in Section 4.
2 Proposed approach
Let \(Q_1, \dots, Q_M\) be the objective scores of M IQA measures, seen as predictor variables in a multiple linear regression model [36]. In the model, \(\boldsymbol{S^o}\) is the estimated response, or objective score, of the resulting hybrid IQA measure. It can be written as follows:

\(\boldsymbol{S^o} = \boldsymbol{Q}\boldsymbol{B} + \epsilon, \quad (1)\)

where \(\boldsymbol{Q}\) is the matrix of predictor scores and \(\boldsymbol{B}\) contains fitted coefficients estimated by minimising the mean squared difference between the outcome, i.e., the vector of subjective scores \(\boldsymbol{S^s}\), and the predicted outcome, \(\boldsymbol{S^o}\); \(\epsilon\) represents the part of the relationship between \(\boldsymbol{Q}\) and \(\boldsymbol{S^s}\) not captured by the linear model, which is omitted when computing predictions.
For a large number of predictors, it would be desirable to select those which are the most informative. This also leads to a more practical hybrid IQA approach, consisting of only a few IQA measures. One possible approach to the problem of predictor selection is to use penalised regression in the lasso form [36]. For a given λ, the lasso determines \(\boldsymbol{B}\) by solving the following optimisation problem:

\(\min_{B_0, \boldsymbol{B}} \left( \frac{1}{2N} \sum_{i=1}^{N} \left( S^s_i - B_0 - \boldsymbol{Q}_i^T \boldsymbol{B} \right)^2 + \lambda \sum_{j=1}^{M} |B_j| \right), \quad (2)\)

where N is the number of objective scores, and λ is a regularisation parameter. In other words, the lasso minimises the residual sum of squares subject to the constraint with the constant α:

\(\sum_{j=1}^{M} |B_j| \le \alpha. \quad (3)\)
In the proposed approach, the λ value which minimises the mean squared error was used to determine the coefficients.
In the experiments, the following M=16 publicly available full-reference IQA measures were used: VSI [52], FSIM [51], FSIMc [51], GSM [16], IFC [33], IW-SSIM [40], MAD [14], MSSIM [41], NQM [8], PSNR [35], RFSIM [50], SR-SIM [49], SSIM [42], VIF [34], IFS [7], and SFF [6]. They were used for the assessment of processed images, and then PSD were obtained. Most of these approaches offer state-of-the-art performance, and their inclusion was mainly driven by the need to achieve a broad sample of various approaches mimicking the human visual system. It is assumed that the lasso regression is able to select several IQA measures and produce a well-performing hybrid measure.
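As an illustration of this selection behaviour, the following numpy sketch fits lasso coefficients by coordinate descent with soft-thresholding on centred predictors. The data, λ value, and all variable names are synthetic and illustrative only; the paper itself used the Matlab Statistics Toolbox, not this code.

```python
import numpy as np

def soft_threshold(rho, lam):
    """Soft-thresholding operator used in lasso coordinate descent."""
    return np.sign(rho) * max(abs(rho) - lam, 0.0)

def lasso_cd(Q, s, lam, n_sweeps=200):
    """Fit lasso coefficients B for centred predictors Q and response s.

    Minimises (1/2N) * ||s - Q B||^2 + lam * ||B||_1
    by cyclic coordinate descent.
    """
    N, M = Q.shape
    B = np.zeros(M)
    z = (Q ** 2).sum(axis=0) / N            # per-feature curvature
    for _ in range(n_sweeps):
        for j in range(M):
            r = s - Q @ B + Q[:, j] * B[j]  # partial residual excluding j
            rho = Q[:, j] @ r / N
            B[j] = soft_threshold(rho, lam) / z[j]
    return B

rng = np.random.default_rng(0)
N, M = 600, 16                  # e.g. 600 training scores, 16 IQA measures
Q = rng.normal(size=(N, M))
# Synthetic "subjective" scores driven by only three of the 16 predictors
s = 1.5 * Q[:, 0] - 1.0 * Q[:, 3] + 0.5 * Q[:, 7] + 0.05 * rng.normal(size=N)
Q = Q - Q.mean(axis=0)          # centre predictors and response
s = s - s.mean()

B = lasso_cd(Q, s, lam=0.1)
selected = np.flatnonzero(np.abs(B) > 1e-6)
print(selected)                 # sparse support; the informative predictors
```

Because the L1 penalty drives the coefficients of uninformative predictors to exactly zero, only a handful of measures survive the fit, mirroring the roughly 5.7 measures per hybrid reported later in the paper.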
The proposed approach uses the first 20 % of images and their subjective scores from a given benchmark dataset in order to obtain the regression coefficients. Since there are four largest IQA benchmark image datasets, four hybrid measures are introduced. In the literature, different numbers of images with scores have been used for this purpose, ranging from 20 % [38], through 30 % [52], 50 % [48], and 100 % [13, 26, 44], to several datasets used jointly [54].
In the experiments, the following four largest IQA benchmarks were used: TID2013 [28], TID2008 [27], CSIQ [14], and LIVE [42]. The number of images in each benchmark, as well as the number of distortions and their levels, is shown in Table 1. Since the number of learning images in a subset is small, the number of scores used in the regression can be considerably increased by employing PSD. To the best of the author's knowledge, PSD have not previously been used for training IQA measures. In this paper, the lasso regression produces the hybrid IQA measure trained with a small subset of images and scores obtained for M=16 IQA measures, as well as trained with pairwise differences of these scores. The obtained fitted coefficients, B, indicate the number and contribution of the most informative IQA measures. Only these measures were used in the quality assessment of the test images. For a selected reference image, all score differences between its distorted equivalents are calculated. For example, for 5 reference images with 24 distortions and 5 distortion levels, 600 images and scores are available in the typical learning scenario, or, as introduced in this paper, \({120 \choose 2} = 7140\) pairwise score differences per reference image, i.e., \({\sum }_{k=1}^{5} \displaystyle {120 \choose 2} = 35700\) in total, can be used. It is assumed that only scores of distorted images that share the same reference image are compared. The usage of PSD can also be motivated by the tristimulus methodology for performing tests with human observers [28], in which two distorted images are presented together with their pristine equivalent at the same time. The observer then selects which distorted image has the better quality, which requires evaluating each distorted image separately, looking at the pristine image, and jointly, while making the decision on their relative quality. Such pairwise image comparison is used to determine subjective opinions for assessed images [28].
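The pairwise-difference construction described above can be sketched as follows. The array layout mirrors a TID-style training subset of 5 reference images with 120 distorted versions each; the scores here are random placeholders, for illustration only:

```python
import numpy as np
from itertools import combinations
from math import comb

def pairwise_score_differences(scores_per_reference):
    """Build PSD: differences between scores of distorted images that
    share the same reference image (no cross-reference pairs)."""
    psd = []
    for scores in scores_per_reference:
        for i, j in combinations(range(len(scores)), 2):
            psd.append(scores[i] - scores[j])
    return np.array(psd)

rng = np.random.default_rng(1)
# 5 reference images, each with 24 distortions x 5 levels = 120 distorted images
scores_per_reference = [rng.uniform(0, 9, size=120) for _ in range(5)]

psd = pairwise_score_differences(scores_per_reference)
print(comb(120, 2))   # 7140 pairs per reference image
print(len(psd))       # 5 * 7140 = 35700 training values in total
```

The same construction applied per IQA measure turns 600 raw training scores into 35700 PSD training values, which is the sample-size increase the regression benefits from.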
In the experiments, images from a given benchmark dataset were divided into five disjoint subsets. Each subset contains 20 % of all images, and each image was evaluated by 16 IQA measures. Finally, after the application of the proposed approach, 40 hybrid IQA measures were obtained; half of them were trained on PSD. For convenience of presentation, only the measures obtained for the first 20 % of benchmark images, namely lasso regression SImilarity Measures (lrSIMs), are written as follows:
In (4)–(11), the number in the subscript denotes the benchmark whose images were partly used for the development of the measure: 1 for TID2013; 2, 3, 4 for TID2008, CSIQ, and LIVE, respectively. The number in the superscript, in turn, indicates whether the measure was developed using raw scores ("1") or PSD ("2"), and the letter in the subscript denotes which subset of training images was used (five letters: a–e). In the evaluation (see Section 3), results for the a subset, or for all subsets together in the form of the mean value, are reported. Taking into account all obtained hybrid measures, it can be said that one hybrid measure uses 5.725 single IQA measures on average. Among the most frequently selected IQA measures, VIF was selected 40 times, MAD 39, IFS 29, FSIM 25, RFSIM 22, PSNR 20, SFF 19, and VSI 17. The remaining IQA measures were used less than 15 times each. Interestingly, FSIMc was not used at all, and NQM, SSIM, and MSSIM were used less than five times, which may indicate that their features were replaced by the remaining IQA measures. Some measures contributed more than others, which is reflected in the weights. For example, in hybrid measures (4)–(5) VSI was the most contributing technique, in (6)–(7) MAD and SFF contributed more than the other techniques, and in (8)–(11) MAD and VIF. The sign of a weight mostly depends on the sign of the correlation between the objective scores produced by the measure and the subjective scores in the benchmark. Experiments were performed using Matlab 7.14 with the Statistics Toolbox.
3 Experimental evaluation
According to the widely-used protocol [10, 35], IQA measures are compared with each other using the following performance indices: Spearman Rank order Correlation Coefficient (SRCC), Kendall Rank order Correlation Coefficient (KRCC), Pearson linear Correlation Coefficient (PCC), and Root Mean Square Error (RMSE). Prediction monotonicity is evaluated by SRCC and KRCC, and prediction accuracy by PCC and RMSE. These performance indices are calculated after a non-linear mapping between the vector of objective scores, \(\boldsymbol{S^o}\), and subjective scores, MOS or DMOS, denoted here by \(\boldsymbol{S^s}\), using the following mapping function for the non-linear regression [35]:

\(\boldsymbol{S_m^o} = \beta_1 \left( \frac{1}{2} - \frac{1}{1 + \exp\left(\beta_2 (\boldsymbol{S^o} - \beta_3)\right)} \right) + \beta_4 \boldsymbol{S^o} + \beta_5,\)
where \(\beta = [\beta_1, \beta_2, \dots, \beta_5]\) are the parameters of the non-linear regression model [35], and \(\boldsymbol{S_m^o}\) is the non-linearly mapped \(\boldsymbol{S^o}\). PCC and RMSE use \(\boldsymbol{S_m^o}\):

\(\mathrm{PCC} = \frac{\bar{\boldsymbol{S_m^o}}^T \bar{\boldsymbol{S^s}}}{\|\bar{\boldsymbol{S_m^o}}\| \, \|\bar{\boldsymbol{S^s}}\|}, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{m} \sum_{i=1}^{m} \left( S_{m,i}^o - S_i^s \right)^2},\)

where \(\bar{\boldsymbol{S_m^o}}\) and \(\bar{\boldsymbol{S^s}}\) denote the mean-removed vectors, and m is the total number of images.
SRCC is calculated as:

\(\mathrm{SRCC} = 1 - \frac{6 \sum_{i=1}^{m} d_i^2}{m(m^2 - 1)},\)

where \(d_i\) is the difference between the ranks of the i-th image in the ordered objective and subjective score vectors, and m is the total number of images. KRCC uses the number of concordant pairs in the dataset, \(m_c\), as well as the number of discordant pairs, \(m_d\):

\(\mathrm{KRCC} = \frac{m_c - m_d}{\frac{1}{2} m (m - 1)}.\)
Values of RMSE closer to 0 are considered better, in contrast to SRCC, KRCC, and PCC, whose values should be close to 1.
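The four indices are straightforward to implement; a self-contained numpy sketch follows. For brevity it omits the logistic mapping step (PCC and RMSE are computed directly on the supplied scores) and the rank helper does not handle ties, which suffices for this illustration:

```python
import numpy as np

def ranks(x):
    """Ordinal ranks (1-based); ties are not handled."""
    r = np.empty(len(x))
    r[np.argsort(x)] = np.arange(1, len(x) + 1)
    return r

def srcc(obj, subj):
    """Spearman rank order correlation coefficient."""
    d = ranks(obj) - ranks(subj)
    m = len(obj)
    return 1 - 6 * np.sum(d ** 2) / (m * (m ** 2 - 1))

def krcc(obj, subj):
    """Kendall rank order correlation from concordant/discordant pairs."""
    m = len(obj)
    mc = md = 0
    for i in range(m):
        for j in range(i + 1, m):
            s = (obj[i] - obj[j]) * (subj[i] - subj[j])
            if s > 0: mc += 1
            elif s < 0: md += 1
    return (mc - md) / (0.5 * m * (m - 1))

def pcc(obj, subj):
    """Pearson linear correlation on mean-removed vectors."""
    a, b = obj - obj.mean(), subj - subj.mean()
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def rmse(obj, subj):
    return np.sqrt(np.mean((obj - subj) ** 2))

subj = np.array([1.0, 2.5, 3.0, 4.5, 6.0, 7.0])
obj = 2 * subj + 1   # perfectly monotone, linearly related scores
print(srcc(obj, subj), krcc(obj, subj), pcc(obj, subj))  # each ~1.0
```

A perfectly monotone, linear predictor scores 1 on all three correlation indices, which is the upper bound the compared IQA measures are assessed against.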
Table 2 contains mean values of the evaluation indices for all developed hybrid measures on the four benchmarks. There are 178500 PSD values for TID2013, and 56950, 12068, and 10081 for TID2008, CSIQ, and LIVE, respectively. It can be seen that, within a benchmark, mean values for hybrid measures trained with PSD, indicated with "2" in the superscript (e.g., lrSIM\(_{1\textit{-}4}^{2}\)), were in almost all cases better than those for hybrid measures learned using images from the benchmark and their raw scores, lrSIM\(_{1\textit{-}4}^{1}\). This indicates that PSD carry more information than raw scores.
Extending the typical testing protocol with results obtained using PSD may lead to more quantitative conclusions about the performance of evaluated IQA measures. Therefore, the proposed hybrid measures trained using PSD, represented by lrSIM\(_{1-4}^{2a}\), are compared with state-of-the-art IQA measures using the four performance indices calculated with raw scores, as well as with PSD. The results of the comparison are shown in Tables 3 and 4. The overall results for RMSE do not take into account the LIVE dataset due to the difference in score range, and weighted results were obtained using the number of images in a benchmark as its weight. The tables contain the six best IQA measures out of the 16 considered in the regression, together with lrSIM\(_{1-4}^{2a}\); the four best performing measures for each evaluation index are written in bold. The obtained results reveal that all presented lrSIMs clearly outperformed the compared IQA measures. For TID2013, where VSI was the best performing non-hybrid measure, hybrid measures trained on images from benchmarks with a considerably smaller number of distortion types than TID2013, i.e., on CSIQ and LIVE, performed worse than measures trained on the TID benchmarks. IQA measures that appear in models trained on CSIQ and LIVE, which share most of their distortion types, perform worse on the newer TID benchmarks, and this can explain the worse results for lrSIM\(_{3}^{2a}\) and lrSIM\(_{4}^{2a}\). However, these results are still better, taking into account overall performance, than those of other non-hybrid IQA measures. Weighted results seem to favour IQA approaches performing better on TID2013 due to the large number of images used as its weight. However, weighted results for tests with PSD show the superior performance of all introduced hybrid measures. Interestingly, the evaluation results on benchmarks with PSD, seen in Table 4, allow further assessment of the performance of the compared non-hybrid IQA measures.
There are cases in which some measures performed better in this test than in the typical evaluation with raw scores. For example, for TID2008 with PSD, SFF was better than IFS and VSI, while in the previous evaluation their precedence was reversed, with VSI as the leading technique. Interestingly, SFF was introduced before IFS by the same authors. Furthermore, in the results on CSIQ with PSD, MAD clearly outperformed the newly introduced SFF and IFS, which was not evident in the typical evaluation. MAD's performance was also confirmed in tests on LIVE with PSD, where it was the fourth best IQA measure.
The evaluation results on the benchmark datasets showed the superior performance of the introduced family of hybrid IQA measures, lrSIMs. However, it would be desirable to show that the approach is also statistically better. In the statistical significance tests, hypothesis tests based on the prediction residuals of each measure after non-linear mapping were conducted using the F-test [14], where a smaller residual variance denotes a better prediction. The results of statistical significance tests on the LIVE benchmark are presented in Fig. 1. The tests cover all 16 IQA measures that were considered in the regression. In the figure, the number "1", "0", or "-1" in a cell denotes that the measure in the row is statistically better with confidence greater than 95 %, indistinguishable, or worse than the measure in the column, respectively. The test revealed that lrSIMs are statistically better than the other IQA measures, and, in many cases, hybrid measures trained with PSD are better than hybrid measures trained with raw scores. Figure 2 presents a summary of the significance tests covering all benchmarks, including tests with PSD. For each benchmark, the numbers in the cells were added. Since there are eight benchmarks, the number in a cell denotes the number of benchmarks in which the measure in the row is significantly better, or worse in the case of a negative value, than the measure in the column. Taking into account the summary of significance tests, it can be seen that lrSIM\(_{2}^{2a}\) is the best performing measure, with non-negative values in its cells. It is worth noticing that lrSIM\(_{2}^{2a}\) is worse than lrSIM\(_{1}^{2a}\) if only significance tests with non-hybrid measures are taken into account. All lrSIMs have non-negative values in cells shared with non-hybrid measures, where they are in rows, and the hybrid IQA measures developed with PSD, lrSIM\(_{1-4}^{2a}\), are at the top of the ranking. Among non-hybrid approaches, SFF is the leading IQA measure, followed by FSIMc, MAD, and VSI.
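The construction of such a significance matrix can be outlined as follows. Since the actual residuals are not reproduced here, random residual vectors stand in for the measures, and the critical F value (which in the paper depends on the degrees of freedom at 95 % confidence) is passed in as an assumed illustrative constant rather than computed:

```python
import numpy as np

def significance_matrix(residuals, f_crit):
    """Pairwise F-test on residual variances after non-linear mapping.

    Entry [r, c] is 1 if the measure in row r has significantly smaller
    residual variance than the measure in column c, -1 if significantly
    larger, and 0 if the two are indistinguishable.
    """
    k = len(residuals)
    var = [np.var(res, ddof=1) for res in residuals]
    out = np.zeros((k, k), dtype=int)
    for r in range(k):
        for c in range(k):
            if r == c:
                continue
            ratio = var[c] / var[r]   # > 1 when row r predicts better
            if ratio > f_crit:
                out[r, c] = 1
            elif ratio < 1 / f_crit:
                out[r, c] = -1
    return out

rng = np.random.default_rng(2)
n = 779                                  # e.g. number of LIVE images
# Three hypothetical measures: two comparable, one clearly worse
residuals = [rng.normal(scale=s, size=n) for s in (1.0, 1.0, 2.0)]
F_CRIT = 1.2   # assumed illustrative 95 % critical value for these d.o.f.
M = significance_matrix(residuals, F_CRIT)
print(M)
```

Summing such matrices over all benchmarks (raw-score and PSD tests alike) yields the aggregated ranking of Fig. 2, where cell values range from -8 to 8.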
The experimental evaluation showed that it is worth using PSD in the training of the proposed hybrid IQA measure family. This can also be seen in the scatter plots of subjective opinion scores against objective scores of the two best IQA measures and lrSIM\(_{1}^{2a}\) on the four benchmarks (see Fig. 3). Here, lrSIM\(_{1}^{2a}\) is better correlated with subjective scores than the compared measures.
Since a hybrid approach is presented in this paper, it should be compared with similar approaches from the literature. Therefore, Table 5 contains a comparison with such approaches on the basis of published SRCC values. This also gives the opportunity to compare the results with non-hybrid IQA measures, which can also be found in the literature. The table contains results for the TID2008, CSIQ, and LIVE benchmarks, since most of the compared measures were not evaluated on TID2013. Here, the best three results for a given benchmark are written in boldface, and results that were not reported are denoted by "-". Furthermore, "-" also denotes overall results for IQA measures that were not evaluated on all three benchmarks or are not independent, i.e., the authors developed a separate measure for each benchmark without providing cross-benchmark results ([13, 18, 20, 22, 25, 48]).
The results of the comparison based on SRCC, presented in Table 5, reveal that lrSIMs outperformed the other measures, being in most cases among the three best IQA techniques. For TID2008, SM-HM-FSIM [11] was the second best technique. However, SM-HM-FSIM is a non-hybrid approach, which makes all lrSIMs better than the other compared hybrid approaches. The presented lrSIM family, together with the approach introduced by Barri et al. [2], outperformed the other techniques on CSIQ. The overall results, as well as the tests on LIVE, showed the superior performance of lrSIMs over the other measures. More specifically, lrSIM\(_{1}^{2a}\) and lrSIM\(_{2}^{2a}\) were clearly better than the other measures, as indicated in the previous experiments.
4 Conclusions
In this paper, a hybrid full-reference IQA measure was introduced. The measure was obtained using lasso regression and pairwise score differences of up to 16 IQA measures seen as predictors. The lasso was able to select the several most important IQA measures. This resulted in the family of hybrid measures, lrSIMs, which was extensively evaluated on the four largest IQA image benchmarks employing SRCC, KRCC, PCC, and RMSE. The evaluation was also based on PSD. The introduced approach outperformed widely used full-reference IQA measures, as well as other hybrid techniques. It can be assumed that the usage of PSD will support the development of other IQA measures based on supervised learning.
The Matlab code of the approach that allows adding other IQA measures, scripts performing pairwise score differences on used benchmarks, as well as the evaluation of compared approaches, are available to download at: http://marosz.kia.prz.edu.pl/lrSIMpsd.html.
References
Attar A, Shahbahrami A, Rad RM (2015) Image quality assessment using edge based features. Multimed Tools Appl:1–16. doi:10.1007/s11042-015-2663-9
Barri A, Dooms A, Jansen B, Schelkens P (2014) A locally adaptive system for the fusion of objective quality measures. IEEE T Image Process 23(6):2446–2458. doi:10.1109/TIP.2014.2316379
Bhandari AK, Kumar A, Chaudhary S, Singh GK (2015) A new beta differential evolution algorithm for edge preserved colored satellite image enhancement. Multidim Syst Sign:1–33. doi:10.1007/s11045-015-0353-4
Chandler DM (2013) Seven challenges in image quality assessment: Past, present, and future research. ISRN Signal Process 2013:53. doi:10.1155/2013/905685
Chandra A, Chattopadhyay S (2014) A new strategy of image denoising using multiplier-less fir filter designed with the aid of differential evolution algorithm. Multimed Tools Appl 75(2):1079–1098. doi:10.1007/s11042-014-2358-7
Chang HW, Yang H, Gan Y, Wang MH (2013) Sparse feature fidelity for perceptual image quality assessment. IEEE T Image Process 22(10):4007–4018. doi:10.1109/TIP.2013.2266579
Chang HW, Zhang QW, Wu QQ, Gan Y (2015) Perceptual image quality assessment by independent feature detector. Neurocomputing 151(part 3):1142–1152. doi:10.1016/j.neucom.2014.04.081
Damera-Venkata N, Kite TD, Geisler WS, Evans BL, Bovik AC (2000) Image quality assessment based on a degradation model. IEEE T Image Process 9(4):636–650. doi:10.1109/83.841940
Gao X, Lu W, Tao D, Li X (2010) Image quality assessment and human visual system. In: Visual communications and image processing 2010, International Society for Optics and Photonics, p 77440Z
Group VQE (2003) Final report from the video quality experts group on the validation of objective models of video quality assessment, phase II (FR-TV2). https://www.itu.int/ITU-T/studygroups/com09/docs/tutorial_opavc.pdf. Accessed 26.06.2016
Huang Y, Chen X, Ding X (2015) A harmonic means pooling strategy for structural similarity index measurement in image quality assessment. Multimed Tools Appl:1–12. doi:10.1007/s11042-015-2620-7
Jin L, Egiazarian K, Kuo CC (2012) Perceptual image quality assessment using block-based multi-metric fusion (BMMF). In: Proc. IEEE Int. conf. on acoustics, speech and signal processing (ICASSP), pp 1145–1148. doi:10.1109/ICASSP.2012.6288089
Lahouhou A, Viennet E, Beghdadi A (2010) Selecting low-level features for image quality assessment by statistical methods. CIT 18(2)
Larson EC, Chandler DM (2010) Most apparent distortion: full-reference image quality assessment and the role of strategy. J Electron Imaging 19(1):011006. doi:10.1117/1.3267105
Li S, Zhang F, Ma L, Ngan KN (2011) Image quality assessment by separately evaluating detail losses and additive impairments. IEEE T Multimed 13 (5):935–949. doi:10.1109/TMM.2011.2152382
Liu A, Lin W, Narwaria M (2012) Image quality assessment based on gradient similarity. IEEE T Image Process 21(4):1500–1512. doi:10.1109/tip.2011.2175935
Liu M, Yang X (2009) A new image quality approach based on decision fusion. In: Ma J, Yin Y, Yu J, Zhou S (eds) Proc. Int. conf. on fuzzy systems and knowledge discovery (FSKD). IEEE, pp 10–14
Liu TJ, Lin W, Kuo CC (2013) Image quality assessment using multi-method fusion. IEEE T Image Process 22(5):1793–1807. doi:10.1109/TIP.2012.2236343
Lu W, Mei N, Gao F, He L, Gao X (2015) Blind image quality assessment via semi-supervised learning and fuzzy inference. Applied Informatics 2 (1):1–20. doi:10.1186/s40535-015-0010-x
Lukin VV, Ponomarenko NN, Ieremeiev OI, Egiazarian KO, Astola J (2015) Combining full-reference image visual quality metrics by neural network. In: Human vision and electronic imaging XX, Proc. SPIE, vol 9394, p 93940K. doi:10.1117/12.2085465
Okarma K (2010) Combined full-reference image quality metric linearly correlated with subjective assessment. In: Artificial intelligence and soft computing. Springer, pp 539–546. doi:10.1007/978-3-642-13208-7_67
Okarma K (2013) Extended Hybrid Image Similarity - combined full-reference image quality metric linearly correlated with subjective scores. Elektronika ir Elektrotechnika 19(10):129–132
Oszust M (2016) Decision fusion for image quality assessment using an optimization approach. IEEE Signal Proc Let 23(1):65–69. doi:10.1109/LSP.2015.2500819
Pei S C, Chen L H (2015) Image quality assessment using human visual dog model fused with random forest. IEEE T Image Process 24(11):3282–3292. doi:10.1109/TIP.2015.2440172
Peng P, Li ZN (2012) A mixture of experts approach to multi-strategy image quality assessment. In: Campilho A, Kamel M (eds) Image analysis and recognition, lecture notes in computer science, vol 7324. Springer, Berlin-Heidelberg, pp 123–130. doi:10.1007/978-3-642-31295-3_15
Peng P, Li ZN (2012) Regularization of the structural similarity index based on preservation of edge direction. In: 2012 IEEE International conference on systems, man, and cybernetics (SMC), pp 2127–2132. doi:10.1109/ICSMC.2012.6378054
Ponomarenko N, Lukin V, Zelensky A, Egiazarian K, Carli M, Battisti F (2009) TID2008 - a database for evaluation of full-reference visual quality assessment metrics. Adv Modern Radioelectron 10:30–45
Ponomarenko N, Jin L, Ieremeiev O, Lukin V, Egiazarian K, Astola J, Vozel B, Chehdi K, Carli M, Battisti F, Kuo CCJ (2015) Image database TID2013: peculiarities results and perspectives. Signal Process-Image 30:57–77. doi:10.1016/j.image.2014.10.009
Rehman A, Rostami M, Wang Z, Brunet D, Vrscay ER (2012) Ssim-inspired image restoration using sparse representation. EURASIP J Adv Sign Process 2012 (1):1–12. doi:10.1186/1687-6180-2012-16
Saha A, Wu QMJ (2014) Full-reference image quality assessment by combining global and local distortion measures. CoRR arXiv:1412.5488
Shahid M, Rossholm A, Lövström B, Zepernick HJ (2014) No-reference image and video quality assessment: a classification and review of recent approaches. EURASIP Int J Image Video Process 2014(1):1–32. doi:10.1186/1687-5281-2014-40
Sharif M, Hussain A, Jaffar MA, Choi TS (2014) Fuzzy similarity based non local means filter for rician noise removal. Multimed Tools Appl 74(15):5533–5556. doi:10.1007/s11042-014-1867-8
Sheikh H, Bovik A, de Veciana G (2005) An information fidelity criterion for image quality assessment using natural scene statistics. IEEE T Image Process 14(12):2117–2128. doi:10.1109/tip.2005.859389
Sheikh HR, Bovik AC (2006) Image information and visual quality. IEEE T Image Process 15(2):430–444. doi:10.1109/TIP.2005.859378
Sheikh HR, Sabir MF, Bovik AC (2006) A statistical evaluation of recent full reference image quality assessment algorithms. IEEE T Image Process 15(11):3440–3451. doi:10.1109/tip.2006.881959
Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc:267–288
Tibshirani R (2011) Regression shrinkage and selection via the lasso: a retrospective. J R Stat Soc 73(3):273–282
Wang F, Sun X, Guo Z, Huang Y, Fu K (2015) An object-distortion based image quality similarity. IEEE Signal Proc Let 22(10):1534–1537. doi:10.1109/LSP.2015.2413891
Wang Z, Bovik AC (2002) A universal image quality index. IEEE Signal Proc Let 9(3):81–84. doi:10.1109/97.995823
Wang Z, Li Q (2011) Information content weighting for perceptual image quality assessment. IEEE T Image Process 20(5):1185–1198. doi:10.1109/tip.2010.2092435
Wang Z, Simoncelli EP, Bovik AC (2003) Multi-scale structural similarity for image quality assessment. In: Proc. IEEE Int. conf. on signals, Systems, and Computers (ASILOMAR), pp 1398–1402
Wang Z, Bovik A C, Sheikh H R, Simoncelli E P (2004) Image quality assessment: from error visibility to structural similarity. IEEE T Image Process 13(4):600–612. doi:10.1109/tip.2003.819861
Wu J, Lin W, Shi G, Liu A (2013) Perceptual quality metric with internal generative mechanism. IEEE T Image Process 22(1):43–54. doi:10.1109/TIP.2012.2214048
Wu J, Lin W, Shi G (2014) Image quality assessment with degradation on spatial structure. IEEE Signal Proc Let 21(4):437–440. doi:10.1109/LSP.2014.2304714
Xue W, Zhang L, Mou X, Bovik AC (2014) Gradient magnitude similarity deviation: a highly efficient perceptual image quality index. IEEE T Image Process 23(2):684–695. doi:10.1109/tip.2013.2293423
Yang J, Hou C, Xu R, Lei J (2010) New metric for stereo image quality assessment based on hvs. Int J Imaging Syst Technol 20(4):301–307. doi:10.1002/ima.20246
Yang J, Lin Y, Gao Z, Lv Z, Wei W, Song H (2015) Quality index for stereoscopic images by separately evaluating adding and subtracting. PLoS ONE 10 (12):1–19. doi:10.1371/journal.pone.0145800
Yuan Y, Guo Q, Lu X (2015) Image quality assessment: a sparse learning way. Neurocomputing 159:227–241. doi:10.1016/j.neucom.2015.01.066
Zhang L, Li H (2012) SR-SIM: a fast and high performance IQA index based on spectral residual. In: Proc. IEEE Int. conf. on image processing (ICIP). IEEE. doi:10.1109/ICIP.2012.6467149
Zhang L, Zhang L, Mou X (2010) RFSIM: a feature based image quality assessment metric using riesz transforms. In: Proc. IEEE Int. conf. on image processing (ICIP). IEEE. doi:10.1109/ICIP.2010.5649275
Zhang L, Zhang L, Mou X, Zhang D (2011) FSIM: A feature similarity index for image quality assessment. IEEE T Image Process 20(8):2378–2386. doi:10.1109/tip.2011.2109730
Zhang L, Shen Y, Li H (2014) VSI: a visual saliency-induced index for perceptual image quality assessment. IEEE T Image Process 23(10):4270–4281. doi:10.1109/tip.2014.2346028
Zhang X, Feng X, Wang W, Xue W (2013) Edge strength similarity for image quality assessment. IEEE Signal Proc Let 20(4):319–322. doi:10.1109/LSP.2013.2244081
Zhou F, Lu Z, Wang C, Sun W, Xia ST, Liao Q (2015) Image quality assessment based on inter-patch and intra-patch similarity. PLoS ONE 10(3):e0116312. doi:10.1371/journal.pone.0116312
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Oszust, M. Image quality assessment with lasso regression and pairwise score differences. Multimed Tools Appl 76, 13255–13270 (2017). https://doi.org/10.1007/s11042-016-3755-x