1 Introduction and literature

1. Nowadays, video technology and its applications are in huge demand in communication systems, and the need for reliable video quality assessment is increasing accordingly. In the recent past, many video quality assessment methods and metrics with varying computational complexity and accuracy have come to light. As per a recent survey [20], 66% of data traffic is due to video transmission from mobile devices.

2. Traditional video quality metrics, such as signal-to-noise ratio (SNR), peak signal-to-noise ratio (PSNR), and mean squared error (MSE), though computationally simple, are known to disregard the viewing conditions and the characteristics of human visual perception [11]. Subjective video quality assessment methods, based on groups of trained/untrained human evaluators, may serve the purpose. However, to meet the ITU-T standards, these evaluation methods must follow stringent conditions on viewing distance, test duration, room illumination, etc. [16, 22], and they are time-consuming, expensive, and laborious.

3. The validation tests for objective video quality metrics developed by VQEG are a remarkable effort in this direction [23].

4. More recently, full reference video quality metrics have been proposed, such as the MOtion-based Video Integrity Evaluation (MOVIE) index by Seshadrinathan and Bovik [8], FLOSIM [13], and the method of Ortiz-Jaramillo et al. [1]. The MOVIE model strives to capture the characteristics of the middle temporal (MT) visual area of the visual cortex in the human brain for video quality analysis. Neuroscience studies indicate that the visual area MT is critical for the perception of video quality [2].

5. The response characteristics of the visual area MT are modelled using separable Gabor filter banks. The MOVIE model defines two indices, namely a Spatial MOVIE index that primarily captures spatial distortions and a Temporal MOVIE index that captures temporal distortions. However, the MOVIE index does not address optical flow based video quality assessment that emphasizes HVS characteristics. In [4], a method was proposed that uses the direction of optical flow in a video to compute video quality assessment scores. In [18], a method was proposed for computing video quality assessment scores by combining HVS and optical flow concepts.

Motion detection plays an important role in video analysis. Optical flow based motion estimation algorithms have become popular for estimating motion trajectories [5, 6, 12]. There are significant contributions in the state of the art for assessing distorted video sequences [15, 19, 24, 25]. However, understanding the human visual system is still an open area of research. Recently, optical flow based video quality assessment algorithms have emerged that attempt to account for human visual system characteristics while measuring the distortions in video sequences [13, 20]. Manasa et al. [13] proposed FLOSIM, an optical flow based full reference (FR) video quality assessment algorithm that measures the perceptual annoyance resulting from temporal distortions. It quantifies the temporal distortions using statistical features of the optical flow: the mean and standard deviation of the flow magnitudes, and the minimum eigenvalue of a flow patch.

In this paper we claim that the use of the minimum eigenvalue as the statistical feature for measuring randomness in the optical flow can be improved. For certain videos, as shown in Fig. 1, the resulting video quality scores are far from the DMOS scores. In Fig. 1, the x-axis represents the test video sequences and the y-axis represents video quality scores in terms of DMOS and FLOSIM values. This motivated us to propose another model to capture the temporal distortions that result from random flow. We propose to use the orientation feature of the optical flow, which effectively quantifies the randomness in flow, as opposed to the minimum eigenvalue used in [13].

Fig. 1: Comparison of FLOSIM video quality scores with DMOS scores

From Fig. 2, we can observe that although the video quality scores obtained using our proposed algorithm are closer to the DMOS scores than those of FLOSIM, the correlation coefficient with DMOS is lower. This motivated us to propose a further measure, referred to as the INT-CWC measure, which provides a comparative closeness score with DMOS for any two video quality assessment algorithms. We hypothesize that if the majority of the video quality scores of an algorithm are close to the profile of the DMOS scores, its performance should be appreciated even if a few extreme deviations from DMOS exist.

Fig. 2: Scatter plot for all the video sequences

Our contributions are summarized as follows. The article addresses the issue of video quality assessment with an emphasis on the human visual system. It first proposes a full reference optical flow based video quality assessment algorithm that uses orientation features, and then proposes the interquartile based comparative weighted closeness measure (INT-CWC). Our experimental results demonstrate that the proposed video quality assessment algorithm achieves video quality scores much closer to the DMOS scores, and that the INT-CWC measure is highly correlated with the video quality score profiles.

The rest of the article is organized as follows. In Section 2, we present our two proposed contributions. Section 3 presents the experimental results. In Section 4, we conclude the article.

2 Proposed work and methodology

The proposed work is presented in the following two subsections. In Section 2.1, we present the enhanced optical flow based full reference video quality assessment algorithm. In Section 2.2, we present INT-CWC, the interquartile based comparative weighted closeness measure.

2.1 Enhanced video quality assessment algorithm

We propose an HVS based full reference video quality assessment model that aims to enhance the algorithm of Manasa et al. (FLOSIM) [13]. The Manasa et al. algorithm first computes temporal distortions and spatial distortions separately and then computes the combined final video quality score by their proposed pooling strategy. We follow the same approach; however, the proposed model differs in the computation of the temporal features. An overview of the Manasa et al. algorithm [13] is given in Fig. 3 (see the figure captioned "overview of the proposed approach" in [13]).

Fig. 3: Overview of Manasa et al. Algorithm

To obtain the spatial distortion features, the MS-SSIM index [26] is used. To extract the temporal features, any optical flow estimation algorithm can be used. The idea behind the computation of the temporal distortion features is that distortions cause a deviation in the local statistical properties of the optical flow when compared to the undistorted optical flow statistics [13]. Further, Manasa et al. [13] claim that the local mean μ|.| and local standard deviation σ|.| are well capable of capturing local flow inconsistencies, and that the randomness of the optical flow can be represented by the minimum eigenvalue λ|.|. We claim that the minimum eigenvalue feature is not well suited to represent the randomness. Due to this feature, for some video sequences, such as Mobile and Calendar and Park run [21, 22], the distortion scores of FLOSIM are far higher than the DMOS values, as shown in Fig. 1.

2.1.1 Notations and models

Let \(\mathbf {V}_{x}^{i} \) and \(\mathbf {V}_{y}^{i} \) be the x and y components of the ith velocity field, each of size M × N, computed using any optical flow estimation algorithm [5, 6, 12] between two consecutive video frames Fi and Fi+ 1, each of size M × N. Let \(\mathbf {V}^{i} =(\mathbf {V}_{x}^{i} , \mathbf {V}_{y}^{i} )\). In the proposed model, we used the Gunnar Farneback optical flow algorithm [5], whose implementation is available in Matlab R2018a. From the velocity matrices obtained using the Gunnar Farneback optical flow estimation algorithm, the magnitudes and scaled phase angles are defined as in (1) and (2), respectively (atan2 in (2) returns angles in radians, which are then scaled).

$$ m_{xy}^{i} = \sqrt{({v_{x}^{i}})^{2}+({v_{y}^{i}})^{2}} $$
(1)

where \({v_{x}^{i}}\) and \({v_{y}^{i}}\) are the elements of \(\mathbf {V}_{x}^{i} \) and \(\mathbf {V}_{y}^{i} \), respectively. Let Mi be the ith magnitude matrix of size M × N, which contains all the magnitude values \(m_{xy}^{i}\) of (1) computed from \(\mathbf {V}_{x}^{i} \) and \(\mathbf {V}_{y}^{i} \). Similarly, the phase angle (orientation) between \({v_{x}^{i}}\) and \({v_{y}^{i}}\) is computed as in (2),

$$ \theta_{yx}^{i} = \operatorname{atan2}({v_{y}^{i}}, {v_{x}^{i}}) \times \frac{360}{\pi} $$
(2)

Let Ai be the ith phase angle matrix of size M × N, which contains the elements \(\theta _{yx}^{i}\) computed from the matrices \(\mathbf {V}_{x}^{i} \) and \(\mathbf {V}_{y}^{i} \), and let \(\mathbf {A}^{i}(x,y)=\theta _{yx}^{i} \) denote the element at coordinate (x,y).
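To make the notation concrete, the following is a minimal NumPy sketch of (1) and (2); the function name and the array-based interface are illustrative, not part of the original model.

```python
import numpy as np

def flow_magnitude_orientation(Vx, Vy):
    """Compute the magnitude matrix M^i (Eq. (1)) and the scaled
    orientation matrix A^i (Eq. (2)) from the flow components.

    Vx, Vy : M x N arrays holding V_x^i and V_y^i, e.g. from a
             Farneback-style optical flow estimator.
    """
    # Eq. (1): Euclidean magnitude of each flow vector.
    M = np.sqrt(Vx ** 2 + Vy ** 2)
    # Eq. (2): atan2 returns radians in (-pi, pi], scaled by 360/pi
    # as in the paper.
    A = np.arctan2(Vy, Vx) * (360.0 / np.pi)
    return M, A
```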

2.1.2 Description

In the Manasa et al. [13] algorithm, the temporal local statistical features are computed from the optical flow components. These local statistical features are computed on a frame-by-frame basis, and each feature is computed on 7 × 7 optical flow patches. The three features computed on the optical flow components are given in (3)–(5).

$$ \mathbf{f}_{\textbf{1}}^{i} = [{\mu_{1}^{i}} , {\mu_{2}^{i}} , {\mu_{3}^{i}}, \dots, \mu^{i}_{K\times L} ]^{T} , $$
(3)
$$ \mathbf{f}_{\textbf{2}}^{i} = [{\sigma_{1}^{i}} , {\sigma_{2}^{i}}, {\sigma_{3}^{i}}, \dots, \sigma^{i}_{K\times L} ]^{T} , $$
(4)
$$ \mathbf{f}_{\textbf{3}}^{i} = [{\lambda_{1}^{i}} , {\lambda_{2}^{i}} , {\lambda_{3}^{i}}, \dots, \lambda^{i}_{K\times L} ]^{T} , $$
(5)

where \(\mathbf {f}_{\textbf {1}}^{i}\) denotes the first per-frame feature vector of Vi, and μ|.| denotes the mean of the flow magnitudes \(m_{xy}^{i}\) of (1) that lie in a local patch of size 7 × 7. Similarly, \(\mathbf {f}_{\textbf {2}}^{i}\) is the second per-frame feature vector of Vi, and σ|.| is the standard deviation of the flow magnitudes \(m_{xy}^{i}\) of (1) that lie in the local patch of size 7 × 7. Finally, \(\mathbf {f}_{\textbf {3}}^{i}\) is the third per-frame feature vector of Vi, and λ|.| denotes the minimum eigenvalue of the covariance matrix of a flow patch. There are K × L non-overlapping patches in a frame.
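A sketch of how these per-frame feature vectors could be computed with NumPy follows; the interpretation of λ|.| as the minimum eigenvalue of the 2 × 2 covariance of the (vx, vy) samples within a patch is our reading of [13], and the helper name is illustrative.

```python
import numpy as np

def frame_features(Vx, Vy, Mag, p=7):
    """Per-frame feature vectors f1, f2, f3 of Eqs. (3)-(5), computed
    over the K x L non-overlapping p x p flow patches of one frame."""
    K, L = Mag.shape[0] // p, Mag.shape[1] // p
    f1, f2, f3 = [], [], []
    for r in range(K):
        for c in range(L):
            sl = np.s_[r * p:(r + 1) * p, c * p:(c + 1) * p]
            mags = Mag[sl].ravel()
            f1.append(mags.mean())   # mu: local mean of flow magnitudes
            f2.append(mags.std())    # sigma: local std of flow magnitudes
            # lambda: minimum eigenvalue of the patch flow covariance
            cov = np.cov(np.stack([Vx[sl].ravel(), Vy[sl].ravel()]))
            f3.append(np.linalg.eigvalsh(cov)[0])
    return np.array(f1), np.array(f2), np.array(f3)
```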

In the proposed model, the computation of the first feature vector \(\mathbf {f}_{\textbf {1}}^{i}\) and the second feature vector \(\mathbf {f}_{\textbf {2}}^{i}\) is the same as in the Manasa et al. algorithm [13], but the computation of the third feature vector \(\mathbf {f}_{\textbf {3}}^{i}\) differs. In the Manasa et al. algorithm, it is stated that the perceivable distortion can be effectively characterized by the measure of dispersion of the features. This measure of dispersion is given in (6), where CV(.) denotes the coefficient of variation, z denotes the data vector, μz is the mean of z, and σz is the standard deviation of z.

$$ CV(\textbf{z}) = \frac{\sigma_{\textbf{z}}}{\mu_{\textbf{z}}} $$
(6)

Once \(CV(\mathbf {f}_{\textbf {1}}^{i})\) and \(CV(\mathbf {f}_{\textbf {2}}^{i})\) are computed and pooled, the difference in dispersion is computed as in (7).

$$ D(\textbf{z}_{r},\textbf{z}_{t}) = CV(\textbf{z}_{r})- CV(\textbf{z}_{t}) $$
(7)

where zr is the reference data vector and zt is the test data vector. The definition in (7) is applied to \(\mathbf {f}_{\textbf {1}}^{i}\) and \(\mathbf {f}_{\textbf {2}}^{i}\), i.e., we obtain \(D(\mathbf {f}_{\textbf {1r}}^{i},\mathbf {f}_{\textbf {1t}}^{i})\) and \(D(\mathbf {f}_{\textbf {2r}}^{i},\mathbf {f}_{\textbf {2t}}^{i})\) from (7).
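The dispersion computations of (6) and (7) translate directly into code; a small sketch follows (function names are ours):

```python
import numpy as np

def cv(z):
    """Coefficient of variation CV(z) = sigma_z / mu_z, Eq. (6)."""
    return np.std(z) / np.mean(z)

def dispersion_diff(z_ref, z_test):
    """Difference in dispersion D(z_r, z_t), Eq. (7)."""
    return cv(z_ref) - cv(z_test)
```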

We propose a different way of computing the third feature vector \(\mathbf {f}_{\textbf {3}}^{i}\), one that captures the randomness of the optical flow and thereby effectively quantifies the temporal distortions. The proposed method of computing \(\mathbf {f}_{\textbf {3}}^{i}\) is presented in (8)–(14).

$$ \mathbf{A}_{\textbf{1}}^{i}(x,y) = \left \{ \begin{aligned} &-1, && \text{if}\ \theta_{yx}^{i} < 0 \\ &1, && \text{otherwise} \end{aligned} \right. $$
(8)
$$ \mathbf{A}_{\textbf{2}}^{i}(x,y)= \lvert \theta_{yx}^{i} \rvert $$
(9)
$$ \mathbf{a}_{\textbf{1}}^{i} = [\sigma_{1}^{\mathbf{A}_{\textbf{1}}^{i}} , \sigma_{2}^{\mathbf{A}_{\textbf{1}}^{i}} , \sigma_{3}^{\mathbf{A}_{\textbf{1}}^{i}}, \dots, \sigma^{ \mathbf{A}_{\textbf{1}}^{i}}_{K\times L} ]^{T} , $$
(10)
$$ \mathbf{a}_{\textbf{2}}^{i} = [\sigma_{1}^{\mathbf{A}_{\textbf{2}}^{i}} , \sigma_{2}^{\mathbf{A}_{\textbf{2}}^{i}} ,\sigma_{3}^{\mathbf{A}_{\textbf{2}}^{i}} , \dots, \sigma^{ \mathbf{A}_{\textbf{2}}^{i}}_{K\times L} ]^{T} , $$
(11)
$$ \mathbf{\eta}_{\textbf{l}}^{i} = \left \{ \begin{aligned} &\sigma_{l}^{\mathbf{A}_{\textbf{1}}^{i}} \times \sigma_{l}^{\mathbf{A}_{\textbf{2}}^{i}} , && \text{if}\ \sigma_{l}^{\mathbf{A}_{\textbf{1}}^{i}} \neq 0 \wedge \sigma_{l}^{ \mathbf{A}_{\textbf{2}}^{i}} \neq 0 \\ &\mathbf{\psi}_{\textbf{l}}^{i}, && \text{otherwise} \end{aligned} \right. $$
(12)

where,

$$ \mathbf{\psi}_{\textbf{l}}^{i} = \left \{ \begin{aligned} &\sigma_{l}^{\mathbf{A}_{\textbf{1}}^{i}}, && \text{if}\ \sigma_{l}^{\mathbf{A}_{\textbf{2}}^{i}} = 0 \\ &\sigma_{l}^{\mathbf{A}_{\textbf{2}}^{i}}, && \text{otherwise} \end{aligned} \right. $$
(13)

for \(l = 1,2,3, \dots , K \times L \).

$$ \mathbf{f}_{\textbf{3}}^{i} = [{\eta_{1}^{i}} , {\eta_{2}^{i}} , {\eta_{3}^{i}}, \dots, \eta^{i}_{K\times L} ]^{T} , $$
(14)

The deviation of the overall values of \(\mathbf {f}_{\textbf {3}}^{i}\) in (14) for the distorted patch from those of the reference patch is quantified as in (15).

$$ C(\mathbf{f}^{i}_{3r},\mathbf{f}^{i}_{3t}) = 1-\text{corr}(\mathbf{f}^{i}_{3r},\mathbf{f}^{i}_{3t}) $$
(15)

where the subscript r denotes the reference set, t denotes the test set, and corr(x,y) is the correlation coefficient computed between two data vectors x and y. The pooling strategy and the final video score computation are the same as in the Manasa et al. algorithm [13]; readers are referred to [13] for further details.
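The following sketch assembles (8)–(15) in NumPy under the same 7 × 7 non-overlapping patch layout as before; the function names are illustrative, and np.corrcoef stands in for corr(x,y) in (15) (degenerate constant vectors are not handled).

```python
import numpy as np

def orientation_feature(A, p=7):
    """Proposed third feature vector f3 of Eqs. (8)-(14).

    A : M x N matrix of scaled flow orientations, Eq. (2).
    """
    A1 = np.where(A < 0, -1.0, 1.0)   # Eq. (8): sign of the orientation
    A2 = np.abs(A)                    # Eq. (9): absolute orientation
    eta = []
    for r in range(A.shape[0] // p):
        for c in range(A.shape[1] // p):
            sl = np.s_[r * p:(r + 1) * p, c * p:(c + 1) * p]
            # Eqs. (10)-(11): per-patch standard deviations of A1 and A2
            s1, s2 = A1[sl].std(), A2[sl].std()
            if s1 != 0 and s2 != 0:   # Eq. (12): product of patch stds
                eta.append(s1 * s2)
            elif s2 == 0:             # Eq. (13): fall back to psi
                eta.append(s1)
            else:
                eta.append(s2)
    return np.array(eta)              # Eq. (14)

def feature_deviation(f3_ref, f3_test):
    """Eq. (15): C(f3r, f3t) = 1 - corr(f3r, f3t)."""
    return 1.0 - np.corrcoef(f3_ref, f3_test)[0, 1]
```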

2.2 INT-CWC: Interquartile based comparative weighted closeness measure

We propose the INT-CWC measure, which aims to quantify the closeness to DMOS of any two video quality assessment algorithms being compared. When the majority of the video quality scores of the first algorithm are closer to DMOS than those of the second algorithm, the first algorithm's performance is superior; similarly, when the majority of the video quality scores of the second algorithm are closer to DMOS than those of the first, the second algorithm's performance is superior.

Let
$$ \mathbf{s}^{\textbf{d}}=[{s_{1}^{d}} , {s_{2}^{d}} , {s_{3}^{d}} , \dots, {s_{N}^{d}} ]^{T}, $$
$$ \mathbf{s}^{\textbf{m}_{\textbf{1}}}=[s_{1}^{m_{1}} , s_{2}^{m_{1}} , s_{3}^{m_{1}} , \dots, s_{N}^{m_{1}} ]^{T}, \text{ and} $$
$$ \mathbf{s}^{\textbf{m}_{\textbf{2}}}=[s_{1}^{m_{2}} , s_{2}^{m_{2}} , s_{3}^{m_{2}} , \dots, s_{N}^{m_{2}} ]^{T} $$

denote the DMOS scores and the video quality scores of the first and second algorithms, respectively, where N is the total number of video scores. The absolute differences between the DMOS scores and the video quality scores of the first algorithm are given in (16),

$$ \mathbf{\Delta}^{\textbf{m}_{\textbf{1}}}=[{\Delta}_{1}^{m_{1}} , {\Delta}_{2}^{m_{1}} , {\Delta}_{3}^{m_{1}} , \dots, {\Delta}_{N}^{m_{1}} ]^{T} $$
(16)

where \({\Delta }_{i}^{m_{1}}= \lvert {s_{i}^{d}} - s_{i}^{m_{1}} \rvert \) for \(i=1,2,3, \dots , N\). Similarly, the absolute differences between the DMOS scores and the video quality scores of the second algorithm are given in (17),

$$ \mathbf{\Delta}^{\textbf{m}_{\textbf{2}}}=[{\Delta}_{1}^{m_{2}} , {\Delta}_{2}^{m_{2}} , {\Delta}_{3}^{m_{2}} , \dots, {\Delta}_{N}^{m_{2}} ]^{T} $$
(17)

where \({\Delta }_{i}^{m_{2}}= \lvert {s_{i}^{d}} - s_{i}^{m_{2}} \rvert \) for \(i=1,2,3, \dots , N\). We propose to use the upper whisker based on the interquartile range (IQR), a statistical dispersion measure [27], in designing the INT-CWC measure. To compute the upper whisker, Q1 is first computed as in (18).

$$ Q_{1}=\text{median}(\mathbf{\Delta}^{\textbf{z}}) $$
(18)

Let \(\mathbf {\Delta }^{\hat {\textbf {z}}}=[{\Delta }_{1}^{\hat {z}} , {\Delta }_{2}^{\hat {z}} , {\Delta }_{3}^{\hat {z}} , \dots , {\Delta }_{n}^{\hat {z}} ]^{T} \) denote the terms of Δz that are greater than Q1. Then Q3 is computed as in (19).

$$ Q_{3}=\text{median}(\mathbf{\Delta}^{\hat{\textbf{z}}}) $$
(19)

The upper whisker is given below in (20).

$$ I=Q_{3} + 1.5 \times IQR $$
(20)

where IQR = Q3 − Q1.
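A sketch of the whisker computation follows, using the paper's procedure of taking Q1 as the overall median and Q3 as the median of the entries above Q1 (the degenerate case where no entry exceeds Q1 is not handled here):

```python
import numpy as np

def upper_whisker(delta):
    """Upper whisker I of Eqs. (18)-(20) for a vector of absolute
    differences delta (e.g. Delta^{m1} or Delta^{m2})."""
    q1 = np.median(delta)                # Eq. (18)
    q3 = np.median(delta[delta > q1])    # Eq. (19)
    return q3 + 1.5 * (q3 - q1)          # Eq. (20): I = Q3 + 1.5 * IQR
```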

Let \(I_{\mathbf {\Delta }^{\textbf {m}_{\textbf {1}}}}\) and \(I_{\mathbf {\Delta }^{\textbf {m}_{\textbf {2}}}}\) be the upper whisker thresholds for \(\mathbf {\Delta }^{\textbf {m}_{\textbf {1}}}\) and \(\mathbf {\Delta }^{\textbf {m}_{\textbf {2}}}\), respectively, each computed using (20). We then update Δz into \(\mathbf {\Delta }^{\textbf {z}_{1}}\) and compute Nz as in (21), where Nz denotes the number of times Δz is updated and is initialized to Nz = 0.

$$ (\mathbf{\Delta}^{\textbf{z}_{1}} (i), N_{\textbf{z}}) = \left \{ \begin{aligned} & (I, N_{\textbf{z}}=N_{\textbf{z}}+1), && \text{if}\ \mathbf{\Delta}^{\textbf{z}} (i)>I,\\ & && \text{for}\ i=1,2,3, \dots, N \\ &(\mathbf{\Delta}^{\textbf{z}} (i), N_{\textbf{z}}), && \text{otherwise} \end{aligned} \right. $$
(21)

Let \(N_{\mathbf {\Delta }^{\textbf {m}_{\textbf {1}}}}\) denote the number of times \(\mathbf {\Delta }^{\textbf {m}_{\textbf {1}}}\) is updated using (21), and \(T_{\mathbf {\Delta }^{\textbf {m}_{\textbf {1}}}}\) the sum of the updated values of \(\mathbf {\Delta }^{\textbf {m}_{\textbf {1}}}\) computed using (21) and (22). Similarly, let \(N_{\mathbf {\Delta }^{\textbf {m}_{\textbf {2}}}}\) denote the number of times \(\mathbf {\Delta }^{\textbf {m}_{\textbf {2}}}\) is updated using (21), and \(T_{\mathbf {\Delta }^{\textbf {m}_{\textbf {2}}}}\) the sum of the updated values of \(\mathbf {\Delta }^{\textbf {m}_{\textbf {2}}}\) computed using (21) and (22).

$$ T_{\textbf{z}_{1}}= \sum\limits_{i=1}^{N} \mathbf{\Delta}^{\textbf{z}_{1}} (i) $$
(22)

Next, compute \((n_{\mathbf {\Delta }^{\textbf {m}_{\textbf {1}}}}, n_{\mathbf {\Delta }^{\textbf {m}_{\textbf {2}}}})\) as in (23),

$$ (n_{\mathbf{\Delta}^{\textbf{m}_{\textbf{1}}}}, n_{\mathbf{\Delta}^{\textbf{m}_{\textbf{2}}}})= \left \{ \begin{aligned} & n_{\mathbf{\Delta}^{\textbf{m}_{\textbf{1}}}}= n_{\mathbf{\Delta}^{\textbf{m}_{\textbf{1}}}}+1, && \text{if}\ {\Delta}_{i}^{m_{1}} < {\Delta}_{i}^{m_{2}},\\ & && \text{for}\ i=1,2,3, \dots, N \\ & n_{\mathbf{\Delta}^{\textbf{m}_{\textbf{2}}}}= n_{\mathbf{\Delta}^{\textbf{m}_{\textbf{2}}}}+1 && \text{if}\ {\Delta}_{i}^{m_{1}} > {\Delta}_{i}^{m_{2}},\\ & && \text{for}\ i=1,2,3, \dots, N \\ &(n_{\mathbf{\Delta}^{\textbf{m}_{\textbf{1}}}}= n_{\mathbf{\Delta}^{\textbf{m}_{\textbf{1}}}}+1,\\ & n_{\mathbf{\Delta}^{\textbf{m}_{\textbf{2}}}}= n_{\mathbf{\Delta}^{\textbf{m}_{\textbf{2}}}}+1), && \text{otherwise} \end{aligned} \right. $$
(23)

where \(n_{\mathbf {\Delta }^{\textbf {m}_{\textbf {1}}}}\) and \(n_{\mathbf {\Delta }^{\textbf {m}_{\textbf {2}}}}\) are initialized to 0.
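In vectorized form, the clipping update of (21)–(22) and the comparative counts of (23) can be sketched as below; note that (23) increments both counters on a tie, which the `<=`/`>=` comparisons reproduce. The function names are ours.

```python
import numpy as np

def clip_and_count(delta, I):
    """Eqs. (21)-(22): clip entries of delta above the whisker I,
    count how many were clipped (N_z), and sum the result (T_z)."""
    clipped = delta > I
    delta1 = np.where(clipped, I, delta)                     # Eq. (21)
    return delta1, int(clipped.sum()), float(delta1.sum())   # Eq. (22)

def closeness_counts(d1, d2):
    """Eq. (23): per-video counts of which algorithm is closer to DMOS."""
    n1 = int(np.sum(d1 <= d2))   # first algorithm closer (ties count)
    n2 = int(np.sum(d1 >= d2))   # second algorithm closer (ties count)
    return n1, n2
```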

Using \(n_{\mathbf {\Delta }^{\textbf {m}_{\textbf {1}}}}\), \(N_{\mathbf {\Delta }^{\textbf {m}_{\textbf {1}}}}\), and \(T_{\mathbf {\Delta }^{\textbf {m}_{\textbf {1}}}}\), and \(n_{\mathbf {\Delta }^{\textbf {m}_{\textbf {2}}}}\), \(N_{\mathbf {\Delta }^{\textbf {m}_{\textbf {2}}}}\), and \(T_{\mathbf {\Delta }^{\textbf {m}_{\textbf {2}}}}\), we compute S1 and S2 as in (24) and (26), with X and Y given in (25) and (27), respectively.

$$ S_{1} = \left \{ \begin{aligned} &X, && \text{if}\ n_{\mathbf{\Delta}^{\textbf{m}_{\textbf{1}}}} \neq 0 \\ &T_{\mathbf{\Delta}^{\textbf{m}_{\textbf{1}}}}, && \text{otherwise} \end{aligned} \right. $$
(24)
$$ X = \left \{ \begin{aligned} &n_{\mathbf{\Delta}^{\textbf{m}_{\textbf{1}}}} \times T_{\mathbf{\Delta}^{\textbf{m}_{\textbf{1}}}} \times \frac{1}{N_{\mathbf{\Delta}^{\textbf{m}_{\textbf{1}}}}} , && \text{if}\ N_{\mathbf{\Delta}^{\textbf{m}_{\textbf{1}}}} \neq 0 \\ &n_{\mathbf{\Delta}^{\textbf{m}_{\textbf{1}}}} \times T_{\mathbf{\Delta}^{\textbf{m}_{\textbf{1}}}}, && \text{otherwise} \end{aligned} \right. $$
(25)
$$ S_{2} = \left \{ \begin{aligned} &Y, && \text{if}\ n_{\mathbf{\Delta}^{\textbf{m}_{\textbf{2}}}} \neq 0 \\ &T_{\mathbf{\Delta}^{\textbf{m}_{\textbf{2}}}}, && \text{otherwise} \end{aligned} \right. $$
(26)
$$ Y = \left \{ \begin{aligned} &n_{\mathbf{\Delta}^{\textbf{m}_{\textbf{2}}}} \times T_{\mathbf{\Delta}^{\textbf{m}_{\textbf{2}}}} \times \frac{1}{N_{\mathbf{\Delta}^{\textbf{m}_{\textbf{2}}}}} , && \text{if}\ N_{\mathbf{\Delta}^{\textbf{m}_{\textbf{2}}}} \neq 0 \\ &n_{\mathbf{\Delta}^{\textbf{m}_{\textbf{2}}}} \times T_{\mathbf{\Delta}^{\textbf{m}_{\textbf{2}}}}, && \text{otherwise} \end{aligned} \right. $$
(27)

Using (24) and (26), we compute the final comparative scores for the first and second algorithms as in (28) and (29), respectively.

$$ S_{\textbf{m}_{\textbf{1}}}=\frac{S_{1}}{S_{2}} $$
(28)
$$ S_{\textbf{m}_{\textbf{2}}}=\frac{S_{2}}{S_{1}} $$
(29)

We refer to \(S_{\textbf {m}_{\textbf {1}}}\) and \(S_{\textbf {m}_{\textbf {2}}}\) as the INT-CWC measures of the first and second algorithms, respectively, in their comparison against DMOS.
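A direct transcription of (24)–(29) follows; it assumes the denominators in (28) and (29) are nonzero, which holds whenever the summed differences are nonzero.

```python
def int_cwc_scores(n1, N1, T1, n2, N2, T2):
    """INT-CWC measures S_m1 and S_m2, Eqs. (24)-(29).

    n*, N*, T* are the counts and sums of Eqs. (21)-(23) for the two
    algorithms being compared against DMOS.
    """
    def s(n, N, T):
        # Eqs. (24)-(25) for algorithm 1, Eqs. (26)-(27) for algorithm 2.
        if n == 0:
            return T
        return n * T / N if N != 0 else n * T
    S1, S2 = s(n1, N1, T1), s(n2, N2, T2)
    return S1 / S2, S2 / S1      # Eqs. (28)-(29)
```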

3 Experimental results and discussion

To evaluate the proposed full reference video quality assessment algorithm, we used the LIVE video database [21, 22]. This database consists of ten original videos and 150 distorted videos. The distorted videos are generated from the original videos using standard distortions: H.264 compression, MPEG-2 compression, and transmission of H.264 compressed bit streams through error-prone IP networks and through error-prone wireless networks. These distortions are applied at different levels, so that each original video yields 15 corresponding distorted videos. Experimental results are shown in Table 1. We compare the experimental results with the Manasa et al. algorithm, FLOSIM [13], using the FLOSIM code available at [3]. The proposed optical flow based full reference video quality assessment algorithm is implemented using the Gunnar Farneback optical flow estimation algorithm available in Matlab R2018a. The proposed video quality scores for Riverbed (rb), Tractor (tr), Sunflower (sf), Pedestrian Area (pa), Shields (sh), and Rushhour (rh) are highly correlated with the FLOSIM scores.

Table 1 Correlation between FLOSIM and proposed algorithm

Further, from Fig. 4, we can observe that the video quality scores obtained for the video sequences Mobile and Calendar (mc), Park run (pr), and Blue sky (bs) using the proposed algorithm are closer to the DMOS scores. Note that in Figs. 2, 4, 5, 6, 7, 8 and 9, the x-axis represents the test video sequences and the y-axis represents video quality scores in terms of DMOS, FLOSIM, and the proposed algorithm. The scatter plot of the video quality scores for all 150 video sequences can be seen in Fig. 2, where we can note that the video quality scores obtained using the proposed algorithm are much closer to the DMOS scores. The video quality score profiles for video sequences with wireless distortions, IP distortions, H.264 compression, and MPEG-2 compression can be seen in Fig. 7, and the video quality score profiles of all the individual video sequences can be seen in Figs. 8 and 9.

Fig. 4: Performance improvement

Fig. 5: Comparison of Correlation Coefficient

Fig. 6: Comparison of INT-CWC Score

Fig. 7: FLOSIM and Proposed video quality scores profile compared to DMOS

Fig. 8: FLOSIM and Proposed video quality scores profile compared to DMOS

Fig. 9: FLOSIM and Proposed video quality scores profile compared to DMOS

From Fig. 4 and Table 2, we can observe that, although the correlation coefficient is lower for the video sequences Mobile and Calendar, Park run, and Blue sky (Table 2) when compared to FLOSIM, the video quality scores obtained using the proposed algorithm are much closer to the DMOS scores. This motivated the design of the INT-CWC measure. From Table 2 and Figs. 5 and 6, we can observe that, although the correlation coefficient is lower, the proposed algorithm outperforms the FLOSIM algorithm of Manasa et al. in terms of the INT-CWC score.

Table 2 Closeness score to DMOS

4 Conclusion

Instead of using the minimum eigenvalue to capture the randomness features, we used the orientation feature of the optical flow to improve the distortion computation. The proposed INT-CWC measure, which aims to quantify the closeness to DMOS of any two video quality assessment algorithms being compared, is a novel attempt. However, there is further scope to improve the orientation feature computation to better capture the randomness.