1 Introduction

A number of factors can degrade images, including low-light photography, out-of-focus capture, processing, and compression. The quality of degraded images may be evaluated using subjective or objective approaches; for automatic evaluation, objective methods are employed. Objective methods can be classified as full-reference (FR), reduced-reference (RR), or no-reference (NR), depending on the availability of the reference image. The FR approach presumes that the reference image is available, while the RR approach requires only partial information about it. However, in many instances, such as when a defocused camera introduces blurriness or low light causes additive white Gaussian noise (AWGN) distortion, neither the reference image nor any information about it is available for comparison. Consequently, the natural solution is the NR approach [1].

Quality evaluation methods can also be divided into general-purpose and distortion-specific categories. General-purpose algorithms can predict the quality of an image altered by any particular distortion, including blockiness, blurriness, noise, and quantization. Even though general-purpose methods are preferred, not all distortion types are assessed with high accuracy, and their versatile nature makes their computational complexity high. In contrast, distortion-specific algorithms are optimized for one or more known distortion types, resulting in highly accurate and simple algorithms [2]. Therefore, distortion-specific approaches are typically favored when the kind and source of an image’s distortion are known. The common sources of blur distortion are as follows. Gaussian blurring is one of the dominant types of distortion perceived in images captured by low-end cameras, such as those used in mobile phones. Blurriness is also introduced when unwanted details and noise are removed from an image by smoothing. Sometimes, to meet bandwidth or storage constraints, a high degree of compression is applied even though some blurriness may develop. These are situations in which the distortion type is known in advance, so a suitable and efficient blur-specific quality evaluation method is required. A number of blur-specific NR methods for measuring the quality of blurred images have been developed in the past, and this remains an active research area. Some of the blur-specific algorithms [3,4,5,6,7,8,9,10,11,12,13,14,15,16,17] developed in the last decade are briefly reviewed below.

In [3], sharpness is first measured at the block level in both the spectral and spatial domains, and the corresponding geometric means are then computed. The top 1% of the means are selected to formulate the metric. Because this method operates in the spectral as well as the spatial domain, its computational complexity is high, and its accuracy is not consistent across databases. A metric named FISH/FISH\(_{bb}\) was proposed in [4]. It uses the log-energies of the discrete wavelet transform (DWT) subbands to compute block-level sharpness, which is then pooled to obtain the quality of the entire image. However, this method is not accurate on unseen images, as it relies on empirical weights tuned on images from known databases. A complex wavelet transform-based method is developed in [5], in which the phases of the complex wavelet coefficients are exploited to measure the sharpness of an image.

To quantify blurriness in [6], generalized Gaussian distribution (GGD) and asymmetric GGD (AGGD) parameters are computed over the maximum local variation (MLV) estimated in a 3\(\times\)3 neighborhood. Using these two parameters, the MLVs are weighted throughout the image, with higher weights assigned to larger MLV values, and the sharpness metric is formulated from their standard deviation. In [7, 8], reblurring of images is used to develop the quality metric. In [8], a step-by-step approach determines the minimum standard deviation needed to generate valid reblurred images, and a blur metric is formulated from the difference between the test and reblurred images. In [7], the metric is developed from the shape of the local histogram of the difference between the reblurred and test images. The contraction of the histogram distribution of blurred images with respect to natural images is exploited to formulate the metric in [9]. A learning-based sharpness metric using Zernike moments and the gradient magnitude (GM) was proposed in [12]; the Zernike moments estimate irregularities and distortions caused by blurring, whereas the GM quantifies the loss of sharpness and fine details. A dictionary learning-based sharpness metric was proposed in [13]. As dictionary atoms are usually edge patterns and image blur is characterized by edge spread, an over-complete dictionary is used to calculate image sharpness; its accuracy is high, but so is its computational complexity.

Free energy principle and auto-regressive based models were proposed in [11, 14]. In [11], the free energy principle and the NFEQM model [18] are combined to develop a sharpness metric that is highly accurate but computationally intensive. In [14], noise energy (termed ‘stem noise’ in the paper) is computed at the block level to quantify image blurriness, with the characteristics of the estimated energy reflecting the blurriness of the image. This method was found to be moderately accurate across databases. A recent sharpness assessment metric based on a local contrast map (spatial domain) and the DWT (transform domain) is proposed in [15]; its accuracy is also not consistent across databases. In [16], a discrete Fourier transform (DFT)-based method was proposed, in which the average ratio of the magnitudes of the AC coefficients before and after the addition of a constant is computed over all DFT coefficients, first at the local level and then pooled to obtain an overall quality score. This method is fast but only moderately accurate. In [17], the difference between the sharpest and blurriest spots of an image is exploited to measure sharpness; it is fast and highly accurate.

Nowadays, owing to the availability of computational power and large sets of labeled training images, a great deal of research in image quality assessment utilizes Deep Learning methods. Some methods that may be useful for assessing the quality of blurred images can be found in [10, 19,20,21,22,23,24].

Although the accuracy of blur-specific algorithms has improved in the recent past and many highly accurate algorithms have been developed, most of them rely either on Machine Learning/Deep Learning or on parameter tuning during model development. Owing to such training, the resulting model cannot always be guaranteed to estimate the quality of images from unseen/future databases accurately. In this paper, a highly accurate method that is independent of Machine/Deep Learning algorithms and of particular databases is proposed. Moreover, its time complexity is lower than that of existing state-of-the-art algorithms. Fast blur image quality assessment offers quick and efficient evaluation, making it suitable for real-time applications, large datasets, and resource-constrained environments; it provides prompt feedback and scalability while maintaining a balance between computational efficiency and accuracy. The proposed work is motivated by the fact that, when blurring occurs, regions with different pixel variations are affected differently. As the human visual system (HVS) is more attentive to distortions in regions with large pixel variations, it is meritorious to exploit them for blurriness estimation. Moreover, blurring causes a loss of overall detail, thereby decreasing both the mean pixel variation and the maximum pixel variation, and the ratio of mean to maximum pixel variation is observed to indicate the degree of blurriness. Motivated by these facts, the maximum and mean pixel variations are utilized to estimate the quality of an image affected by blurriness.

The rest of this paper is organized as follows. It starts with a detailed discussion of the proposed methodology in Sect. 2, followed by performance validation and comparison with existing methods over various databases in Sect. 3. Finally, the paper is concluded in Sect. 4.

2 Proposed methodology

As discussed earlier, variation in pixel values is an indication of image sharpness/blurriness. The pixel variations of an image can be evaluated using the high-pass Laplacian filter [2, 25] with coefficients H given in Eq. 1

$$\begin{aligned} H=\dfrac{1}{4} \begin{bmatrix} 0 &{} -1 &{} 0\\ -1 &{} 4 &{} -1\\ 0 &{} -1 &{} 0 \end{bmatrix} \end{aligned}$$
(1)

For example, to observe the pixel variations of the image Monarch (Fig. 1a), it is convolved with the high-pass Laplacian filter H. The resulting filtered image, representing its pixel-variation map, is shown in Fig. 1b.
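As an illustrative sketch (in Python with NumPy/SciPy rather than the MATLAB implementation used in the paper; the function name is ours), the pixel-variation map of Fig. 1b can be computed as follows:

```python
import numpy as np
from scipy.signal import convolve2d

# High-pass Laplacian filter H of Eq. 1
H = 0.25 * np.array([[ 0, -1,  0],
                     [-1,  4, -1],
                     [ 0, -1,  0]], dtype=float)

def pixel_variation_map(img):
    """Convolve a grayscale image with H; smooth regions map to values
    near zero (dark pixels), sharp regions to large magnitudes (bright)."""
    return convolve2d(img, H, mode='same', boundary='symm')
```

Since H is symmetric under rotation, convolution and correlation coincide here; `boundary='symm'` mirrors the image at its borders to avoid spurious edge responses.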

Fig. 1
figure 1

(a) Monarch image. (b) The pixel variations map

On comparing the two images, it is observed that small pixel variations (smooth regions) are reflected by darker pixels, whereas large pixel variations (sharp regions) are reflected by brighter pixels.

It is well known that an image with Gaussian blur distortion can be mathematically represented as the convolution of the image with a Gaussian low-pass filter. Let the blurred image be denoted by I(i,j). As the standard deviation (\(\sigma\)) of the filter increases, the blurriness in the image also increases. As the HVS is most attentive to the regions where the pixel variation is maximum, it is important to see how blurriness affects the maximum absolute pixel variation (M).

Fig. 2
figure 2

Plot of maximum pixel variation (M) vs. blurriness level (\(\sigma\))

Fig. 3
figure 3

Scatterplot of maximum pixel variation (M) vs. blurriness level (\(\sigma\))

To measure the value of M, the blurred image I(i,j) is first convolved (\(\circledast\)) with H to obtain the high-pass filtered output \(I'(i,j)\) as given in Eq. 2

$$\begin{aligned} {I'(i,j) = (I \circledast H)(i,j)} \end{aligned}$$
(2)

M is then obtained as the maximum absolute value in \(I'(i,j)\), as given in Eq. 3

$$\begin{aligned} {M=\max [|I'(i,j)|]} \end{aligned}$$
(3)
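The two steps of Eqs. 2 and 3 can be sketched as follows (Python/SciPy rather than the paper's MATLAB; the step-edge test image is only an illustrative stand-in for a natural image):

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from scipy.signal import convolve2d

# High-pass Laplacian filter H of Eq. 1
H = 0.25 * np.array([[ 0, -1,  0],
                     [-1,  4, -1],
                     [ 0, -1,  0]], dtype=float)

def max_pixel_variation(img):
    """M of Eq. 3: maximum absolute value of the high-pass output of Eq. 2."""
    i_prime = convolve2d(img, H, mode='same', boundary='symm')  # Eq. 2
    return float(np.abs(i_prime).max())                         # Eq. 3

# Mirror the experiment of Fig. 2: M should fall as the blur level sigma rises
img = np.zeros((64, 64))
img[:, 32:] = 1.0  # vertical step edge
m_values = [max_pixel_variation(gaussian_filter(img, s))
            for s in (0.5, 1.0, 2.0, 4.0)]
```

On this test image, `m_values` decreases monotonically with \(\sigma\), consistent with the trend of Fig. 2.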

The plot of M as a function of the blurriness level (\(\sigma\)) for four different natural images (from the LIVE database) is shown in Fig. 2. It is clear that the maximum pixel variation M decreases as the blurriness level (\(\sigma\)) increases. For further validation, the scatter plot between M and \(\sigma\) for the images of the LIVE [26] database is shown in Fig. 3. From the scatter plot, it is also clear that M is large for less blurred images (small \(\sigma\)) and small for more blurred images (large \(\sigma\)). The magnitude of the correlation (SROCC) between M and \(\sigma\) is found to be high (SROCC = 0.992), and M and \(\sigma\) are inversely related, i.e., the maximum pixel variation decreases as the standard deviation/blurriness increases. In NR quality evaluation, \(\sigma\) is usually not available; therefore, the maximum pixel variation M can be exploited as a feature to measure the blurriness of the image, as given by Eq. 4

$$\begin{aligned} Q_1=\frac{1}{M} \end{aligned}$$
(4)

When the maximum value is high, indicating sharp features, its inverse \(Q_{1}\) is low; conversely, when the maximum value is low (indicating blurriness), \(Q_{1}\) is high. This feature essentially emphasizes the importance of sharp elements in the image for higher quality scores. Further, blurring also causes a loss of detail, thereby decreasing both the mean absolute pixel variation \({\bar{m}}\) (Fig. 4) and the maximum pixel variation M (Fig. 2). Comparing Figs. 2 and 4, the value of M decreases more rapidly than \({\bar{m}}\) as the blurriness level increases. Consequently, the ratio of mean to maximum pixel variation, \({\bar{m}}/M\), increases with increasing blurriness, as evident from Fig. 5. When this ratio is high, it indicates that, on average, the edges and transitions in the image are relatively intense compared to the sharpest features; conversely, a low ratio suggests pronounced sharp features compared to the overall average. This ratio effectively distinguishes images with uniform sharpness from those with localized areas of high sharpness. Thus, \({\bar{m}}/M\) may also be used as an indicator of the level of blurriness, and the other quality indicator is given by Eq. 5

$$\begin{aligned} Q_{2}= \frac{{\bar{m}}}{M} \end{aligned}$$
(5)

Figure 6 shows an image with different levels of blurriness and the corresponding quality indicator scores \(Q_{1}\) and \(Q_{2}\). From these values, it is clear that they reflect the blurriness in the image. The values of \(Q_{1}\) and \(Q_{2}\) are merged by a weighted geometric mean to obtain the unified quality score Q, as given in Eq. 6

$$\begin{aligned} Q=Q_1^{\gamma } \times Q_2^{1-\gamma }\end{aligned}$$
(6)

The value of \(\gamma\) is determined empirically; from extensive simulation, \(\gamma\) = 0.5 is found to give the best result, signifying that \(Q_1\) and \(Q_2\) carry equal weight. The effect of different values of \(\gamma\) on the quality score is discussed in Sect. 3.1. The process of computing the quality metric Q is shown in the block diagram of Fig. 7. To measure the pixel variations, an image is first high-pass filtered by H, and then the maximum pixel variation (M) and the mean pixel variation (\({\bar{m}}\)) are computed. Using Eqs. 4 and 5, \(Q_{1}\) and \(Q_{2}\) are obtained, and finally their weighted geometric mean gives the overall quality score Q.
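Putting Eqs. 2–6 together, the whole metric is only a few lines. A sketch under the same assumptions as above (Python/SciPy with illustrative names; the paper's implementation is in MATLAB):

```python
import numpy as np
from scipy.signal import convolve2d

# High-pass Laplacian filter H of Eq. 1
H = 0.25 * np.array([[ 0, -1,  0],
                     [-1,  4, -1],
                     [ 0, -1,  0]], dtype=float)

def quality_score(img, gamma=0.5):
    """Blur metric Q of Eq. 6; a larger Q indicates a blurrier image."""
    v = np.abs(convolve2d(np.asarray(img, dtype=float), H,
                          mode='same', boundary='symm'))
    M = v.max()       # maximum pixel variation (Eq. 3)
    m_bar = v.mean()  # mean pixel variation
    q1 = 1.0 / M                              # Eq. 4
    q2 = m_bar / M                            # Eq. 5
    return q1 ** gamma * q2 ** (1.0 - gamma)  # Eq. 6
```

Note that with \(\gamma = 0.5\), Eq. 6 reduces to \(Q = \sqrt{{\bar{m}}}/M\), making the dominant role of M in the combined score explicit.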

Fig. 4
figure 4

Plot of mean pixel variation (\({\bar{m}})\) vs. blurriness level (\(\sigma\))

Fig. 5
figure 5

Plot of ratio of mean pixel variation to maximum pixel variation (\({\bar{m}}/M)\) vs. blurriness level (\(\sigma\))

Fig. 6
figure 6

Quality score \(Q_{1}\) and \(Q_{2}\) of an image with different levels of blurriness

Fig. 7
figure 7

Block diagram of the proposed method

3 Results and discussion

In this section, the effectiveness of the proposed algorithms is thoroughly assessed and compared with that of other algorithms.

3.1 Experimental settings

The performance of the proposed method is evaluated on the standard databases LIVE [26], VCL [27], TID2008 [28], TID2013 [29], and CSIQ [30]. While the mean opinion score (MOS) of the blurred images is provided in the TID2008, TID2013, and VCL databases, blurred images with differential mean opinion scores (DMOS) are provided in LIVE and CSIQ. The accuracy is measured by computing Spearman’s rank-order correlation coefficient (SROCC), Pearson’s linear correlation coefficient (PLCC), and the root-mean-square error (RMSE) between the predicted subjective scores and the true subjective scores (MOS/DMOS) [15]. PLCC and RMSE measure the linearity and the error, respectively, between the predicted and true scores.
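These three accuracy measures can be sketched directly with SciPy (Python rather than the paper's MATLAB; the function name is ours):

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

def accuracy_measures(predicted, subjective):
    """Return (SROCC, PLCC, RMSE) between predicted and true subjective scores."""
    predicted = np.asarray(predicted, dtype=float)
    subjective = np.asarray(subjective, dtype=float)
    srocc, _ = spearmanr(predicted, subjective)  # monotonicity
    plcc, _ = pearsonr(predicted, subjective)    # linearity
    rmse = float(np.sqrt(np.mean((predicted - subjective) ** 2)))
    return srocc, plcc, rmse
```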

The magnitudes of PLCC and SROCC lie in the range [0, 1], where the greatest value of 1 denotes a perfect linear and monotonic relationship, respectively, and the minimum value of 0 denotes a complete lack of correlation. When the subjective scores are predicted perfectly, the RMSE is 0, and it grows as the difference between the predicted and true subjective scores increases. Since the range of subjective scores varies across databases, the range of RMSE values differs accordingly.

To predict the perceived subjective quality scores (MOS/DMOS), the quality metric Q is mapped to MOS/DMOS for entire databases using the logistic fitting function [15, 31] given in Eq. 7

$$\begin{aligned} DMOS=\psi _{1}\left( \frac{1}{2}-\frac{1}{1+e^{\psi _{2}(Q-\psi _{3})}} \right) +\psi _{4}Q+\psi _{5}\end{aligned}$$
(7)

In Eq. 7, \(\psi _{1},\psi _{2},\psi _{3},\psi _{4}\), and \(\psi _{5}\) are the model parameters. It is worth noting that the values of the model parameters vary with the databases, as the databases differ in their subjective-rating methodologies. A logistic fitting function with four model parameters [4, 17] is also used in the literature; it may give slightly different PLCC and RMSE values, but the SROCC remains the same. Moreover, to evaluate the performance of training-free methods, computing the monotonicity (SROCC) between Q and the subjective scores of the given databases is sufficient, as the test images may come from unknown/future databases, and the SROCC is independent of the number of parameters used in the logistic function [14]. However, wherever required, both PLCC and SROCC are included in the discussion.
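The mapping of Eq. 7 is ordinary nonlinear least squares; a sketch using SciPy's `curve_fit` follows (the initial values in `p0` are our heuristic assumptions, not from the paper):

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic5(q, p1, p2, p3, p4, p5):
    """Five-parameter logistic function of Eq. 7 mapping metric Q to DMOS."""
    return p1 * (0.5 - 1.0 / (1.0 + np.exp(p2 * (q - p3)))) + p4 * q + p5

def fit_mapping(q_scores, dmos):
    """Fit psi_1..psi_5 by nonlinear least squares; p0 is a heuristic start."""
    q_scores = np.asarray(q_scores, dtype=float)
    dmos = np.asarray(dmos, dtype=float)
    p0 = [np.ptp(dmos), 1.0, np.mean(q_scores), 1.0, np.mean(dmos)]
    params, _ = curve_fit(logistic5, q_scores, dmos, p0=p0, maxfev=20000)
    return params
```

The fitted parameters are then plugged back into `logistic5` to obtain the predicted DMOS for each metric value.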

The proposed algorithm is implemented using MATLAB 2018a on a computer equipped with an Intel Xeon 2.13 GHz CPU, 20 GB of RAM, and a 500 GB hard drive.

As discussed in the last section (see Eq. 6), the quality metric Q depends on the empirically determined \(\gamma\). The SROCC between Q and MOS/DMOS as \(\gamma\) varies is computed for all the image databases and given in Table 1.

Table 1 SROCC for different values of \(\gamma\) over different databases

The best SROCC for each database is highlighted in the table. It may be observed that the best SROCC is obtained for \(\gamma\) in the range [0.4, 0.7]. From this analysis, \(\gamma\)=0.5 is the best choice when the SROCC across all databases is considered and is therefore used throughout the paper. Moreover, the PLCC and RMSE values given in Tables 2 and 3, respectively, also support the chosen value of \(\gamma\).

Table 2 PLCC for different values of \(\gamma\) over different databases
Table 3 RMSE for different values of \(\gamma\) over different databases

Moreover, Table 1 also shows the individual effects of \(Q_1\) and \(Q_2\) on the quality metric Q. When \(\gamma\)=0, the metric depends on \(Q_2\) only (as evident from Eq. 6); similarly, for \(\gamma\)=1, it depends on \(Q_1\) only. In both cases, the SROCC is low. Therefore, it may be deduced that combining \(Q_1\) and \(Q_2\) is worthwhile, as it gives better accuracy.
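The \(\gamma\) sweep of Table 1 amounts to recomputing the SROCC for each candidate \(\gamma\). A sketch (Python/SciPy; `q1` and `q2` are assumed per-image feature arrays with their corresponding subjective scores):

```python
import numpy as np
from scipy.stats import spearmanr

def gamma_sweep(q1, q2, subjective, gammas=None):
    """|SROCC| between Q = Q1^gamma * Q2^(1-gamma) and the subjective
    scores for each candidate gamma (Table 1 repeats this per database)."""
    if gammas is None:
        gammas = np.arange(0.0, 1.01, 0.1)
    q1 = np.asarray(q1, dtype=float)
    q2 = np.asarray(q2, dtype=float)
    result = {}
    for g in gammas:
        q = q1 ** g * q2 ** (1.0 - g)    # Eq. 6
        rho, _ = spearmanr(q, subjective)
        result[round(float(g), 1)] = abs(rho)
    return result
```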

3.2 Accuracy over a set of images

First, the quality score Q is computed for an image with four different levels of blurriness, as shown in Fig. 8. It is evident that the objective quality score Q is consistent with the subjective DMOS score: a higher value of Q corresponds to a higher DMOS, and a lower value of Q to a lower DMOS. Furthermore, the scores Q clearly discriminate between images with different levels of blurriness. Next, the quality score Q is computed for four different images with a similar level of blurriness, as shown in Fig. 9. Since the images have similar levels of blurriness, their DMOS values are almost identical, as are the predicted quality scores Q. Here too, the scores Q are consistent with the DMOS scores, indicating the good prediction accuracy of the proposed algorithm.

Fig. 8
figure 8

Subjective score (DMOS) and quality metric (Q) for images with same content and different degree of blurriness

Fig. 9
figure 9

Subjective score (DMOS) and quality metric (Q) for images with different content and similar degree of blurriness

For further validation of the proposed algorithm, the quality score Q is computed for all the blurred images of the LIVE database. Figure 10 shows the scatter plot, with the logistic fitting curve, between the objective quality score Q and the subjective DMOS score. From the figure, it is observed that the quality score Q increases with increasing DMOS (blurriness in the image), and the two are highly correlated. Using the logistic fitting function of Eq. 7, the quality score Q (logarithmic value in base 10) is mapped to the predicted DMOS (\(DMOS_{P}\)). The parameter values obtained by logistic fitting are \(\psi _{1}=-44.31,\psi _{2}=6.171,\psi _{3}=13.66,\psi _{4}=62.1,\psi _{5}=93.22\). The values of \(DMOS_{P}\) for the images of Figs. 8 and 9 are indicated under the respective images. For all the images of the LIVE database, the scatter plot of \(DMOS_{P}\) against the true DMOS is shown in Fig. 11. From these results, it is apparent that \(DMOS_{P}\) is close to the true DMOS. It is worth noting that in the LIVE database, the range of DMOS is around [0, 100] for the pristine images and their five different levels of blurriness.

Fig. 10
figure 10

Scatter plot of subjective score (DMOS) vs. quality metric (Q)

Fig. 11
figure 11

Scatter plot of true and predicted DMOS

3.3 Accuracy comparison over standard image databases

The effectiveness of the proposed metric is also evaluated against other state-of-the-art algorithms on images from the standard databases LIVE, VCL, TID2008, TID2013, and CSIQ. Except for the algorithms marked with a dash (−), the MATLAB code for these algorithms is publicly available. Table 4 gives the SROCC (between Q and DMOS) comparison with the other algorithms, with the top three performers on each database highlighted in bold font. It may be observed that the proposed metric is consistently among them, whereas the other algorithms are not consistently accurate across all the databases. For example, algorithms such as SPARISH [13] and ARISM/ARISM\(_{c}\) [11] are among the most accurate on the LIVE database, but their accuracy suffers on the TID2008, TID2013, and CSIQ databases. Among the tabulated algorithms, [17] is the most consistent, and the proposed method is highly competitive with it except on the TID2008 database.

Table 4 SROCC comparison of the proposed method with state-of-the-art NR blur specific metrics
Table 5 PLCC comparison of the proposed method with state-of-the-art NR blur specific metrics

Comparing the PLCC and RMSE values given in Tables 5 and 6, it may be observed that the proposed method is among the top three performers except on the LIVE database. Therefore, overall, the proposed method is highly competitive in accuracy.

Table 6 RMSE comparison of the proposed method with state-of-the-art NR blur specific metrics

3.4 Statistical significance test

To establish the efficiency of the proposed method, a statistical significance test is performed using the F test (one-tailed, 5% significance level) against some competitive metrics, as given in Table 7, following the F test methodology of [1]. In the table, a value of ‘1’ indicates that the proposed metric is statistically superior to the metric in the corresponding row, while ‘0’ indicates that the tabulated metric is statistically equivalent to the proposed one. From the table, it may be observed that the proposed metric outperforms the listed algorithms on most of the databases, reflecting its effectiveness.

Table 7 Statistical significance test (F test)

3.5 Comparison with deep learning methods

The proposed method is also compared with state-of-the-art Deep Learning-based quality assessment methods, as given in Table 8. Most of the Deep Learning-based methods are inconsistent in accuracy across the different databases. Even though the proposed method is training free, it is more accurate and robust than the tabulated learning-based methods.

Table 8 SROCC comparison of the proposed method with Deep learning metrics

3.6 Comparison with NR general-purpose algorithms

The proposed algorithm is also compared with state-of-the-art general-purpose algorithms, as tabulated in Table 9, with the best two algorithms for each database shown in bold font. From the table, it is observed that the proposed method is highly competitive on the VCL and TID2008 databases and outperforms all the other algorithms on the LIVE, TID2013, and CSIQ databases. The performance of the proposed metric also upholds the necessity of distortion-specific metrics.

Table 9 SROCC comparison of the proposed method with state-of-the-art NR general-purpose metrics

3.7 Time complexity comparison

Besides accuracy, complexity is an important factor in performance evaluation for real-time applications. The complexity of the different algorithms is measured in terms of time complexity and given in Table 10. From this table, it is evident that the proposed method is simpler, as its time complexity is much lower than that of the other algorithms: LPC-SI is 9 times and BISHARP 25 times slower than the proposed method, and the method proposed in this paper is 3\(\times\) faster than [16]. It is also simpler than the competitive metric given in [17].

Table 10 Time complexity (in seconds) of different algorithms for an image of size 768\(\times\)512 pixels

4 Conclusion

The variation in neighboring pixels of an image is one of the indicators of blurriness. In this paper, the maximum and mean pixel variations of an image have been exploited to estimate the quality of a blurred image. The proposed method has been found to be highly accurate and competitive across the standard databases against both distortion-specific and general-purpose algorithms. Moreover, it also outperforms most of the existing algorithms in terms of time complexity and hence may be used in real-time applications.