1 Introduction

Advances in devices for multimedia service access (smartphones, tablets, etc.) and improvements in service infrastructures have resulted in an exponentially growing demand for high-quality multimedia services. In particular, videos for real-time entertainment are the predominant source of traffic in the current Internet scenario, and the trend shows continuous growth [6]. It is expected that mobile traffic will increase by more than 80 % by 2020 compared to 2010 and will further increase by 175 % by 2025 compared to 2020 [33]. Moreover, video content will account for more than half of the total delivered content volume, and the demand for wireless video delivery will double by 2018 [5].

The delivery of video over bandwidth-limited and error-prone networks with a required level of quality is a crucial challenge. The focus of quality assessment is now shifting from compliance with system design goals to the fulfillment of user needs and expectations [37]. In the traditional network-centered approach, quality has been measured in terms of Quality of Service (QoS), which is expressed through key factors such as packet loss or delay [20]. QoS alone is insufficient to represent the users' needs and expectations. In recent years, the term Quality of Experience (QoE) has been widely used to represent the quality and is defined as:

  • the overall acceptability of an application or service, as perceived subjectively by the end-user. Notes: 1) Quality of Experience includes the complete end-to-end system effects (client, terminal, network, services infrastructure, etc). 2) Overall acceptability may be influenced by user expectations and context. [21];

  • the degree of delight or annoyance of the user of an application or service. It results from the fulfillment of his or her expectations with respect to the utility and / or enjoyment of the application or service in the light of the user's personality and current state [27].

To improve the delivery of video services to the end user with high QoE, an efficient resource allocation/scheduling algorithm can be adopted. Most of the recently proposed resource allocation schemes are designed based on QoE models [45]. The performance of these methods can be improved by considering the transmission impairments and the video content related parameters, and their impact on QoE [26, 39, 44]. Unlike these works, we do not focus on the effect of delayed control information or on the impact of an improved resource allocation scheme; we propose to study the impact of network impairments and to understand the masking or enhancing effect of different content on the perceived quality. The analysis of the impact of key transmission impairments and video content related parameters on QoE may help in the robust design and adaptation of multimedia infrastructures, services, and applications.

To the best of our knowledge, in the state of the art (briefly summarized in Section 2), the impact of network impairments (delay, jitter, PLR, bandwidth limitation) on the perceived quality has already been addressed. However, these analyses rely on a limited number of videos and on opinion scores collected from a limited number of subjects. Moreover, limited work has been performed on understanding the impact of the video content on QoE, and the relation between transmission impairments and the impact of video content on the quality has not been analyzed.

This paper addresses the impact of delay, jitter, PLR, and bandwidth on the perceived video quality by considering video content and context information. Moreover, this paper analyzes the effectiveness of the parameters usually employed to characterize the video content, such as Spatial perceptual Information (SI), Temporal perceptual Information (TI) [24], content motion, and data rate. In more detail:

  • the impact of key transmission impairments (delay, jitter, Packet Loss Rate (PLR), and bandwidth limitation),

  • the influence of the video content related parameters (spatial-temporal information, video motion, and source data rate), and

  • the impact of the video content related parameters for different values of the impairments

are studied. To this aim, a recently proposed video quality database, ReTRiEVED [31], with a large number of videos and opinion scores collected from a significant number of subjects, has been used. The rest of the paper is organized as follows: Section 2 summarizes the state-of-the-art works related to this contribution, Section 3 briefly introduces the database, Section 4 presents the data processing and the video content characterization tools and techniques, Section 5 reports the results and comments, and Section 6 draws the conclusions.

2 Related works

The selection and consideration of the key quality influencing artifacts is an important step towards the robust design of video/image quality metrics and QoE models [40–42]. In this context, the resource scheduling algorithm may be adapted based on the impact of transmission impairments on the perceived quality. As an example, in [44], the impact of delayed control information is considered during the design of the scheduling algorithm to improve its performance. In [39], the impact of packet loss is considered in an adaptive video transmission scheme.

Moreover, by taking into account the video content related features, i.e., spatial-temporal perceptual information [9] and salient motion related features [8, 36], the prediction capability of the quality assessment metric might be enhanced. The consideration of bit rate, frame rate, resolution, packet loss rate, video content features, and screen size of the terminal equipment further enhances the prediction capability of the QoE metric [26]. Therefore, in-depth knowledge of the impact of video content and network impairments on QoE can support the definition of QoE models and scheduling algorithms.

To the best of our knowledge, the impact of the transmission impairments (mainly PLR and jitter) and of encoding artifacts on QoE is a widely investigated topic. In more detail, jitter and jerkiness have been analyzed, for example, in [19]. The results show that the QoE decreases logarithmically with the frame rate in the presence of jerkiness and that the perceptual impact of jitter is highly content dependent. The H.264/AVC coding standard is very sensitive to network disturbances, and the perceived quality quickly drops with nominal packet loss and packet delay variation [28]. The impact of transmission impairments on H.265/HEVC encoded video streaming, considering the PLR as impairment, is studied in [30]. In [7], it is demonstrated that jitter can degrade the video quality as much as packet loss and that even low values of jitter or packet loss result in a severe degradation of the perceived video quality. Moreover, videos with low temporal information do not suffer as much as videos with high temporal information for the same level of jitter. The impact of packet loss, latency, and bandwidth is briefly presented in [3], where it is concluded that packet loss is more important than latency for predicting the subjective quality. The impact of latency and jitter on the perceived quality is studied in [1], where it is shown that jitter has a significant impact on the perceived quality. The perceptual and attentive impact of delay and jitter in multimedia delivery networks is presented in [15]; the results show that delay and jitter significantly affect the user QoE and that content variation also affects user satisfaction. The impact of the number of pauses, their duration, and their temporal location in TCP-based transmission is studied in [34], where a new QoE metric for video streaming services is proposed.
The network domain parameters (packet loss, delay, jitter, and packet reordering) affect the video quality more than the video content, though noise factors like motion, complexity, and location also have a significant impact on the perceived quality [10, 11, 18, 31].

Furthermore, the authors of [29] and [2] show that different contents are affected differently by varying network performance. The authors of [32] present the influence of the source content and encoding configuration on the perceived quality for scalable video coding. The authors of [14] conclude that content dependencies and visual attention also have a significant influence on user experience, and that the impact of codec and network parameters on QoE depends on the video content. Moreover, the QoE is also highly correlated with the users' preference of content type [34, 43].

3 Video quality database

To analyze the impact of transmission impairments and video content on perceived quality, the availability of video quality databases is of crucial importance. In the state of the art, many video quality databases have been proposed [13]. Most of them have been designed with the aim of analyzing the impact of encoding and packet loss artifacts. For the purpose of this study, the ReTRiEVED video quality database has been used, and its main features are reported in the following. The database has been created by considering eight heterogeneous SouRCe (SRC) videos with different context, color temperature, motion, etc. The basic features of the source video sequences are summarized in Table 1, and sample frames extracted from each video sequence are shown in Fig. 1.

Table 1 Details of SRCs (original video sequences) including Size, Frame Rate (FR), and Length
Fig. 1 Sample frames for the considered videos

The test video sequences, PVSs (Processed Video Sequences), are generated by streaming the original video sequences from a VideoLAN streaming server [38] (using MPEG2 encoding at 9000 Kbps and the UDP protocol) through a noisy channel emulated by the Network EMulator (NetEM) [16]. In the experimental set-up, a single wired link has been used to stream the video from the server; at the receiver side, the streamed video is saved in Transport Stream (TS) format and shown to the subjects for their opinion. The effect of the delay has been emulated by introducing a set of five different delay amounts (100, 300, 500, 800, and 1000 ms) on each packet passing through the node. The effect of jitter has been added by introducing a fixed delay of 100 ms plus five variable delays (1, 2, 3, 4, and 5 ms). The effect of the PLR has been introduced by randomly dropping packets in an intermediate node with seven different PLR values (0.1, 0.4, 1, 3, 5, 7, and 10 %). Finally, the channel capacity is controlled for five different values of channel bandwidth (512 Kbps, 1 Mbps, 2 Mbps, 3 Mbps, and 5 Mbps) by using the token bucket filter.

As a consequence of the introduction of delay, each packet is uniformly delayed; however, the quality of the rendered video is not altered significantly, as shown in Fig. 2b. Moreover, Fig. 2c and d show that high values of jitter and PLR may produce broken blocks and repeat lines artifacts in the decoded video, thereby significantly degrading the perceived quality. Similarly, as shown in Fig. 2e, severe bandwidth limitation may cause broken blocks, repeat lines, and false color artifacts.

Fig. 2 Sample frames extracted from the Ice sequence showing the artifacts caused by different impairments

The selection of the different values of the impairments (as shown in Table 2) is based on ITU and ETSI recommendations [12, 23, 24]. At the receiver side, a VLC player (Version 2.1.3 with a caching size of 300 ms) has been used. Accordingly, to analyze the effect of the impairments and the impact of video content, 184 different test videos (40 for delay, 40 for jitter, 56 for PLR, 40 for bandwidth, and 8 MPEG2 encoded reference videos) with their corresponding subjective scores, collected from 41 subjects, have been considered. In more detail, to study the impact of each impairment on the perceived quality, one impairment at a time has been considered. As an example, when studying the impact of delay variations, the impact of jitter, PLR, and bandwidth limitation has not been considered (i.e., the values of jitter and PLR are set to zero).
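The emulation set-up described above can be summarized programmatically. The following Python sketch enumerates the impairment levels listed in the text, builds illustrative `tc` command lines for NetEm and the token bucket filter, and checks that the number of test videos adds up to 184. The interface name `eth0` and the token bucket `burst` and `latency` values are placeholders chosen for illustration only; they are not reported in the text, and the exact commands used to build the database are not known.

```python
# Impairment levels as listed in the text (one impairment at a time).
DELAYS_MS = [100, 300, 500, 800, 1000]
JITTERS_MS = [1, 2, 3, 4, 5]               # added on top of a fixed 100 ms delay
PLR_PERCENT = [0.1, 0.4, 1, 3, 5, 7, 10]
BANDWIDTHS_KBIT = [512, 1000, 2000, 3000, 5000]
N_SRC = 8                                   # number of source (SRC) videos


def tc_commands(dev="eth0"):
    """Return one illustrative tc command line per emulated condition.

    `dev`, `burst` and `latency` are assumptions, not values from the paper.
    """
    base = f"tc qdisc add dev {dev} root"
    cmds = [f"{base} netem delay {d}ms" for d in DELAYS_MS]
    cmds += [f"{base} netem delay 100ms {j}ms" for j in JITTERS_MS]
    cmds += [f"{base} netem loss {p}%" for p in PLR_PERCENT]
    cmds += [f"{base} tbf rate {b}kbit burst 32kbit latency 400ms"
             for b in BANDWIDTHS_KBIT]
    return cmds


def count_test_videos():
    """Every SRC under every condition, plus one reference per SRC."""
    return len(tc_commands()) * N_SRC + N_SRC


if __name__ == "__main__":
    for line in tc_commands():
        print(line)
    print(count_test_videos())
```

With 5 + 5 + 7 + 5 = 22 conditions and 8 source videos, the count is 22 × 8 + 8 = 184, which matches the number of test videos reported above.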

Table 2 Emulated transmission impairments

4 Data processing and video content characterization

This section introduces the data processing tools and the video content characterization parameters.

4.1 Outlier detection and opinion score estimation

Outlier detection is the procedure that identifies the subjects whose scores deviate strongly from the mean behavior, i.e., that show a significant bias compared to the average behavior, and removes those observers from the analysis [22]. As described in ITU-R Recommendation BT.500-13 [22], for each test sequence k, the mean \(\bar {x_{k}}\), the standard deviation \(s_{k}\), and the kurtosis coefficient \(\beta2_{k}\) are computed. \(\beta2_{k}\) is given by:

$$ {\beta2_{k}} = \frac{m_{4}}{{m_{2}}^{2}}, $$
(1)
$${m_{x}}=\frac{\sum\limits_{i = 1}^{N} ({x_{ik}}-\bar x_{k})^{x}}{N}, $$

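where N is the number of subjects and \(x_{ik}\) is the judgment given by the i-th subject for the k-th test video. For each observer i, two counters \(p_{i}\) and \(q_{i}\), initially set to zero, are then updated over all the test sequences.

That is, if \(2 \leq \beta2_{k} \leq 4\), then: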

$$\left\{\begin{array}{lr} \textit{if~} (x_{ik} \geq \bar{x_{k}} + 2 {s_{k}}), & p_{i} = p_{i} + 1\\ \textit{if~} (x_{ik} \leq \bar{x_{k}} - 2 {s_{k}}), & q_{i} = q_{i} + 1 \end{array}\right. $$

else

$$\left\{\begin{array}{lr} \textit{if~} (x_{ik} \geq \bar{x_{k}} + \sqrt{20} {s_{k}}), & p_{i} = p_{i} + 1\\ \textit{if~} (x_{ik} \leq \bar{x_{k}} - \sqrt{20} {s_{k}}), & q_{i} = q_{i} + 1 \end{array}\right. $$

Finally, observer i is rejected if \(\frac {p_{i}+q_{i}}{K}> 0.05\) and \(\left |\frac {p_{i}-q_{i}}{p_{i}+q_{i}}\right |<0.3\), where K is the number of test sequences rated by the observer, as specified in BT.500-13 [22].
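The screening steps above can be sketched in Python as follows. This is a minimal illustration, not the code used for the database: it assumes the BT.500-13 formulation in which the ratio \(p_{i}+q_{i}\) is normalized by the number of rated sequences and the asymmetry ratio is taken in absolute value, it uses the sample standard deviation, and it skips sequences on which all subjects agree (zero variance). The function name `screen_observers` and the input layout (one list of scores per subject) are hypothetical.

```python
import math
from statistics import mean, stdev


def screen_observers(scores):
    """Return the indices of the observers rejected by the screening.

    scores[i][k] is the score given by subject i to test sequence k.
    """
    n_subjects = len(scores)
    n_sequences = len(scores[0])
    p = [0] * n_subjects  # scores far above the mean
    q = [0] * n_subjects  # scores far below the mean

    for k in range(n_sequences):
        column = [scores[i][k] for i in range(n_subjects)]
        avg = mean(column)
        s = stdev(column)
        if s == 0:
            continue  # all subjects agree: nothing can deviate
        m2 = sum((x - avg) ** 2 for x in column) / n_subjects
        m4 = sum((x - avg) ** 4 for x in column) / n_subjects
        beta2 = m4 / m2 ** 2
        # 2*s for approximately normal scores, sqrt(20)*s otherwise
        factor = 2.0 if 2 <= beta2 <= 4 else math.sqrt(20)
        for i, x in enumerate(column):
            if x >= avg + factor * s:
                p[i] += 1
            if x <= avg - factor * s:
                q[i] += 1

    rejected = []
    for i in range(n_subjects):
        total = p[i] + q[i]
        if total == 0:
            continue
        if total / n_sequences > 0.05 and abs(p[i] - q[i]) / total < 0.3:
            rejected.append(i)
    return rejected
```

Note that a subject who is consistently lower (or higher) than the panel is not rejected by this rule, because the asymmetry ratio is then close to one; only subjects who deviate erratically in both directions are removed.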

Based on the outlier detection procedure, the scores collected from 34 out of 41 subjects for the delay impairment, 36 out of 41 subjects for the jitter artifact, 36 out of 41 subjects for PLR, and 23 out of 30 subjects for bandwidth limitation were considered.

After the outliers have been removed, the perceived video quality is measured in terms of Mean Opinion Score (MOS) [24]. The MOS is the mean of the collected opinion scores, i.e., of the values on a predefined scale that the subjects assign to the video quality [25]. In the considered database, the single-stimulus Absolute Category Rating (ACR) method with a discrete five-level scale has been used to obtain the subjective quality scores of the test video sequences. Subjects rated the stimuli from one to five (Table 3) according to the perceived quality. The MOS for the k-th video is calculated by (2):

$$ MO{S_{k}} = \frac{1}{N}\sum\limits_{i = 1}^{N} {x_{ik}}, $$
(2)

where N is the number of subjects and \(x_{ik}\) is the judgment given by the i-th subject for the k-th video.

Table 3 Opinion score rating [22]

The 95 % Confidence Interval (CI) of the Sample Statistics (SS, in this case corresponding to the MOS score) is computed by:

$$ CI = SS \pm ME, $$
(3)

where ME is the Margin of Error.

Moreover, \(ME = CV \times SE\), where the Critical Value (CV) is obtained from the t-distribution for a 95 % confidence level and the Standard Error (SE) is computed by:

$$ SE=St. Dev./\sqrt{n}, $$
(4)

where St. Dev. is the standard deviation of the opinion scores and n is the sample size, i.e., 41 subjects.
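As a worked illustration of (2), (3) and (4), the following Python sketch computes the MOS and its 95 % confidence interval for one video. The function name `mos_with_ci` is hypothetical, and the critical value is passed in by the caller because the Python standard library does not provide the t-distribution; the values quoted in the usage note come from a standard two-sided t table.

```python
import math
from statistics import mean, stdev


def mos_with_ci(scores, critical_value):
    """Return (MOS, (lower, upper)) for the opinion scores of one video.

    critical_value is the two-sided 95 % value of the t-distribution
    with len(scores) - 1 degrees of freedom, supplied by the caller.
    """
    n = len(scores)
    mos = mean(scores)                            # Eq. (2)
    standard_error = stdev(scores) / math.sqrt(n)  # Eq. (4)
    margin = critical_value * standard_error       # ME = CV x SE
    return mos, (mos - margin, mos + margin)       # Eq. (3)


if __name__ == "__main__":
    # Five made-up ratings; 2.776 is the table value for 4 degrees of freedom.
    print(mos_with_ci([4, 5, 3, 4, 4], 2.776))
```

For 41 subjects (40 degrees of freedom) the tabulated critical value is approximately 2.021; if outliers are removed before averaging, the degrees of freedom shrink accordingly.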

4.2 Video content characterization

To characterize the video content, Spatial perceptual Information (SI), Temporal perceptual Information (TI), video motion, and source data rate have been considered. The SI measures the amount of spatial detail of each frame and is higher for spatially complex scenes. In order to compute the SI, each video frame is first filtered with the Sobel filter, then the standard deviation over the pixels of each filtered frame is computed, and the maximum value over time is chosen to represent the spatial information content of the scene, as in (5). The TI indicates the amount of temporal change in a video sequence and is higher for high motion sequences. The TI measurement is based on the motion difference feature, as shown in (6):

$$ SI = \underset{time}{\max}\{ \mathrm{std}_{space}[\mathrm{Sobel}({F_{n}})]\}, $$
(5)
$$ TI = \underset{time}{\max} \{ \mathrm{std}_{space}[{F_{n}}(i,j) - {F_{n - 1}}(i,j)]\}, $$
(6)

where \(F_{n}\) is the video frame at time n, \(\mathrm{std}_{space}\) is the standard deviation over the pixels of each filtered frame, and \(\underset {time}{\max } \) is the maximum value in the considered time interval.
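The definitions in (5) and (6) can be illustrated with a small pure-Python sketch. It assumes that frames are given as lists of rows of luminance values, that the Sobel output is the gradient magnitude evaluated on interior pixels only, and that the population standard deviation is used; these choices, as well as the function names, are assumptions made for illustration.

```python
import math
from statistics import pstdev


def sobel_magnitude(frame):
    """Sobel gradient magnitude on the interior pixels of one frame."""
    height, width = len(frame), len(frame[0])
    out = []
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            gx = (frame[r - 1][c + 1] + 2 * frame[r][c + 1] + frame[r + 1][c + 1]
                  - frame[r - 1][c - 1] - 2 * frame[r][c - 1] - frame[r + 1][c - 1])
            gy = (frame[r + 1][c - 1] + 2 * frame[r + 1][c] + frame[r + 1][c + 1]
                  - frame[r - 1][c - 1] - 2 * frame[r - 1][c] - frame[r - 1][c + 1])
            out.append(math.hypot(gx, gy))
    return out


def spatial_information(frames):
    """SI as in (5): maximum over time of the spatial std of Sobel(F_n)."""
    return max(pstdev(sobel_magnitude(f)) for f in frames)


def temporal_information(frames):
    """TI as in (6): maximum over time of the spatial std of F_n - F_(n-1)."""
    values = []
    for prev, cur in zip(frames, frames[1:]):
        diff = [cur[r][c] - prev[r][c]
                for r in range(len(cur)) for c in range(len(cur[0]))]
        values.append(pstdev(diff))
    return max(values)
```

`spatial_information` needs frames of at least 3 × 3 pixels, and `temporal_information` needs at least two frames. A flat frame gives SI = 0 and a static sequence gives TI = 0, in line with the interpretation of the two indicators given above.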

The video motion [35] is estimated by using a global motion coefficient, G, and it can be computed by (7)

$$ G = \frac{{{{\left| {E - M} \right|}_{ave}}}}{{\left| {1 - {M_{ave}}} \right|}}, $$
(7)

where, M and E are the mode and mean of the motion vector magnitudes (corresponding to two consecutive frames). M and E are computed as:

$$ M = {\text{mode}}_{\{ i = 1,2,{\ldots},m\}}\left( {\sqrt {{{\left( {{M_{x(i)}}} \right)}^{2}} + {{\left( {{M_{y(i)}}} \right)}^{2}}} } \right), $$
(8)
$$ E = \frac{1}{m}\sum\limits_{i = 1}^{m} {\left( {\sqrt {{{\left( {{M_{x(i)}}} \right)}^{2}} + {{\left( {{M_{y(i)}}} \right)}^{2}}} } \right)}, $$
(9)

where \(M_{x(i)}\) and \(M_{y(i)}\) are the horizontal and vertical components of motion vector i and m is the number of motion vectors per frame.
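A possible reading of (7), (8) and (9) is sketched below in Python. It assumes that the subscript "ave" denotes an average over the frames of the sequence, that motion vectors are already available as (horizontal, vertical) pairs for each frame, and that the mode is taken over the exact magnitude values; the function name `global_motion` is hypothetical and the motion estimation step itself is not shown.

```python
import math
from statistics import mean, mode


def global_motion(frames_vectors):
    """Global motion coefficient G for a sequence.

    frames_vectors[n] is the list of (mx, my) motion vectors of frame n.
    """
    modes = []
    abs_diffs = []
    for vectors in frames_vectors:
        magnitudes = [math.hypot(mx, my) for mx, my in vectors]
        m = mode(magnitudes)   # Eq. (8)
        e = mean(magnitudes)   # Eq. (9)
        modes.append(m)
        abs_diffs.append(abs(e - m))
    # Eq. (7): |E - M| averaged over frames, divided by |1 - average of M|
    return mean(abs_diffs) / abs(1 - mean(modes))
```

The expression is undefined when the average mode equals one, so a caller would need to handle that case. On Python versions older than 3.8, `statistics.mode` raises an error when several magnitudes are equally frequent.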

5 Results and discussion

In the following, the impact of video content and key transmission impairments on perceived quality is presented. Firstly, the impact of the key impairments on the quality is briefly introduced. Secondly the impact of the video content related parameters TI, SI, global motion coefficient, and source data rate is presented and finally the impact of the video content related parameters for different values of impairments is analyzed.

5.1 Impact of key impairments

As expected, transmission impairments have a significant impact on the perceived quality [31]. The perceived quality decreases significantly for high values of jitter and PLR artifacts. Similarly, the perceived quality increases for high values of bandwidth and remains almost constant when the bandwidth becomes larger than 2 Mbps. However, the quality score is not affected by the delay impairment. A detailed analysis is presented in the following sections.

The results are analyzed by using plots and the Analysis of Variance (ANOVA) test [17]. The one-way ANOVA test compares the means of two or more groups of data and determines whether any of those means are significantly different from each other. In particular, it tests the null hypothesis that the group means are equal. The test statistic, indicated as the F-value, follows the F-distribution (Fisher-Snedecor distribution) under the null hypothesis; if the corresponding probability (p-value) is smaller than the significance level, the test rejects the null hypothesis, i.e., it accepts the alternative hypothesis that at least one of the group means is significantly different from the others. In this paper, the widely used significance levels α = 0.01 and α = 0.05 are adopted.
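To make the test statistic concrete, the following Python sketch computes the one-way ANOVA F-value and its degrees of freedom from a list of groups. It is a textbook formulation, not the tool used for the results reported here, and the function name `one_way_anova_f` is hypothetical. Obtaining the p-value additionally requires the cumulative F-distribution, which is not part of the Python standard library; a statistics package such as SciPy (`scipy.stats.f_oneway`) can provide it.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F-value, df_between, df_within) for a list of groups."""
    k = len(groups)
    all_values = [x for g in groups for x in g]
    n_total = len(all_values)
    grand_mean = mean(all_values)

    # Variability of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


if __name__ == "__main__":
    # Three made-up groups of scores, for illustration only.
    print(one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]]))
```

A large F-value means that the spread between the group means is large compared to the spread inside the groups, which corresponds to a small p-value. The sketch does not handle the degenerate case in which all observations within every group are identical.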

The analysis is based on two approaches:

  1. Intra-video analysis: this analysis has been performed to understand the impact of the variation of the impairment level on the perceived quality of the same video. For each SRC, a large number of test video sequences has been created by using different values of the impairments. For the analysis, these test videos and their corresponding MOS scores have been used.

  2. Inter-video analysis: this analysis has been performed to understand the impact of the video content on the perceived video quality over the whole database. For the same values of the impairment, the videos (SRCs) and their corresponding MOS scores have been analyzed. During the analysis, all the considered values of the impairments have been used.
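One way to read the two approaches is as two different groupings of the same MOS values before the ANOVA test is applied. The Python sketch below illustrates this reading under the assumption that the intra-video analysis compares groups defined by the impairment level, while the inter-video analysis compares groups defined by the source video. The data layout (a dictionary keyed by video and level), the function names, and the numbers in the usage example are hypothetical.

```python
from collections import defaultdict


def intra_video_groups(mos):
    """Group MOS values by impairment level (one group per level)."""
    groups = defaultdict(list)
    for (video, level), score in mos.items():
        groups[level].append(score)
    return dict(groups)


def inter_video_groups(mos):
    """Group MOS values by source video (one group per SRC)."""
    groups = defaultdict(list)
    for (video, level), score in mos.items():
        groups[video].append(score)
    return dict(groups)


if __name__ == "__main__":
    # Made-up MOS values for two videos and two delay levels (ms).
    mos = {("a", 100): 4.0, ("a", 300): 4.1,
           ("b", 100): 3.0, ("b", 300): 3.2}
    print(intra_video_groups(mos))
    print(inter_video_groups(mos))
```

Under this reading, a large p-value for the first grouping indicates that the impairment level does not change the scores, while a small p-value for the second grouping indicates that the scores differ from one video to another.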

5.1.1 Impact of delay on perceived quality

From Fig. 3a, it can be noticed that the MOS scores neither increase nor decrease over the considered values of the delay artifact. In other words, the perceived quality is not influenced by the adopted values of the delay, and this trend holds for all the considered delay values. The result is further confirmed by the intra-video analysis, as shown in Table 4: for the delay artifact, the p-value is equal to 0.998, meaning that there is no evidence to reject the null hypothesis. That is, the MOS scores are not significantly different for the different values of the delay artifact across all the considered video sequences. From these results, it can be concluded that the adopted values of the delay impairment do not significantly influence the perceived video quality. A possible explanation is that, when only transmission delay is present (no packet loss or jitter), the video can be displayed smoothly with the help of a buffer [4].

Fig. 3 95 % confidence interval plot of the MOS scores for different test videos for different values of the transmission impairments

Table 4 Result of one-way ANOVA test used to analyze the impact of the impairments and their significance level on MOS (calculation is based on the MOS score for different values of the impairments)

Figure 3 also shows that for the same value of the delay artifact, 100 ms, the confidence interval of the MOS scores is wide, and the trend is confirmed for all the adopted values of the delay. Furthermore, Fig. 4 demonstrates that there is high variability in the MOS scores of the videos (a), (e), and (h) for a delay value of 300 ms, and the trend is confirmed for all the videos and all the considered delay values. These results show that the MOS scores of the considered videos are significantly different even for the same value of the delay artifact. In other words, the perceived quality varies significantly with the video content itself rather than with the delay artifact. This is further confirmed by the result of the inter-video analysis shown in Table 5. In this analysis, for the same value of delay, the different videos and their corresponding MOS scores have been analyzed, and all the considered values of the impairment have been used. The result, a p-value equal to 0, indicates that the null hypothesis is rejected. That is, the perceived quality of the different videos is significantly different even for the same amount of delay, and it depends more on the video content than on the adopted values of the delay.

Fig. 4 MOS score for the videos with different values of delay artifact at a confidence interval of 95 %

Table 5 Result of one-way ANOVA test used to analyze the impact of the video content for different transmission impairments (computation is based on the subjective score given to the videos with different values of the impairments)

5.1.2 Impact of jitter on perceived quality

Figures 3b and 5 demonstrate that high values of jitter result in a severe degradation of the MOS score, and that the MOS scores become almost constant at their minimum for high values of jitter (> 2 ms). The results of the intra-video analysis in Table 4 show that, for the jitter artifact, the p-value is equal to 0. This means that the MOS scores are significantly different for the different values of the jitter artifact for all the considered videos. These results show that the perceived quality is significantly decreased by the jitter artifact.

Fig. 5 MOS score for the videos with different values of jitter artifact at a confidence interval of 95 %

Furthermore, from Fig. 5 it can be noticed that when the jitter is equal to 1 ms, the videos (a) and (h) have significantly different MOS scores; the same holds for all the videos, in particular for low values of jitter (< 2 ms). However, for high values of jitter (> 2 ms) the MOS scores are not remarkably different across the videos. These results show that for small values of the jitter artifact (< 2 ms) the perceived quality also depends on the video content. When the jitter becomes high (> 2 ms), the perceived quality is mainly affected by the presence of the impairment, and the MOS scores are not significantly different for diverse video content. The result is further supported by the inter-video analysis shown in Table 5. For the jitter artifact, the p-value is equal to 0.992, indicating that if the video is impaired by jitter, the MOS scores are not significantly different for the different videos. In other words, if the video quality is distorted by the jitter artifact, the video content does not have a significant impact on the perceived quality.

5.1.3 Impact of PLR on perceived quality

The impact of different values of PLR (Table 2) on the perceived video quality is presented in Figs. 3c and 6. The results show that high values of the PLR result in a severe degradation of the MOS scores. The result of the intra-video analysis (the impact of different values of the impairment on the same reference video) is shown in Table 4. For the PLR impairment, the p-value is equal to 0, which indicates that the MOS scores are significantly different for the considered PLR values for all the videos. It can therefore be concluded that the perceived quality degrades significantly for all the considered videos for high values of the PLR.

Fig. 6 MOS score for the videos with different values of PLR artifact at a confidence interval of 95 %

Figure 3c also shows that the confidence interval of the MOS scores for the considered video sequences is wide when the PLR is 0.1 % and becomes narrow for high values of the PLR. Moreover, Fig. 6 demonstrates that for a PLR of 0.1 %, the videos (a), (f), and (h) present noticeable differences in the MOS scores, and this trend holds for all the videos for low values (< 5 %) of PLR. These results indicate that if the video is not significantly influenced by the PLR artifact, the perceived quality also depends on the video content. However, high values of PLR result in a severe degradation of the perceived quality independently of the video content. Moreover, the result of the inter-video analysis (the impact of the video content for different values of the impairment) is shown in Table 5. For the PLR, the p-value is equal to 0.9897. This means that if the video quality is already affected by high values of PLR, the MOS scores of the different video sequences for different values of the PLR artifact are not significantly different.

5.1.4 Impact of bandwidth limitation on perceived quality

The impact of the bandwidth limitation on the MOS scores is presented in Figs. 3d and 7. The results show that, for MPEG2 encoded video sequences, the MOS score increases almost linearly with the bandwidth up to 2 Mbps; once the bandwidth becomes greater than 2 Mbps, the MOS scores are almost constant for all the videos. Moreover, the result of the intra-video analysis is presented in Table 4. For the bandwidth limitation, the p-value is equal to 0, which indicates that the MOS scores are significantly different for different values of the bandwidth for all the considered video sequences. From these results, it can be concluded that high values of the bandwidth result in significantly better quality.

Fig. 7 MOS score for the videos with different levels of bandwidth limitation at a confidence interval of 95 %

Furthermore, from Fig. 3d, it can be noticed that for high values of the bandwidth the confidence interval of the MOS scores for different videos is wide. Moreover, Fig. 7 demonstrates that for low values of bandwidth (< 2 Mbps) the test videos result in similar MOS scores. For high values of the bandwidth (> 2 Mbps), the videos (a), (e), and (h) have significantly different MOS scores, and the trend is equally applicable to all the video sequences. These results show that for high values of the bandwidth, the perceived quality is also significantly different for different video content. Finally, the result of the inter-video analysis is shown in Table 5. For the bandwidth limitation, the p-value is equal to 0.93, i.e., the null hypothesis cannot be rejected when all the bandwidth values are considered together; the content-dependent differences are visible only for high values of the bandwidth, as shown in Fig. 7.

From the results presented in this section, it can be concluded that when the impact of the impairments is high (high values of jitter and PLR and low values of bandwidth), the perceived quality is mainly driven by the presence of the impairments, and the video content does not have a significant impact on the perceived quality. When the impact of the impairments is low (high values of bandwidth, low values of jitter and PLR, and all the considered values of the delay impairment), the perceived quality is significantly different for different videos. In other words, for a low impact of the impairments the perceived quality is influenced by the video content. Therefore, the study of the impact of the video content for different transmission impairments is of crucial importance.

5.2 Impact of video content

To understand the impact of video content on the perceived video quality, the set of eight MPEG2 compressed reference videos (without transmission impairments) and their corresponding subjective opinion scores collected from the 41 subjects is used. For each video, the mean opinion score and the corresponding 95 % confidence interval are computed. From Fig. 8, it can be noticed that the MOS scores vary over the considered set of reference videos and that the 95 % confidence interval is also significantly wide. Moreover, the impact of video content on perceived quality is further analyzed by using the ANOVA test, and the result (p-value equal to 0) shows that the null hypothesis is to be rejected. This confirms that the perceived quality is significantly influenced by the video content. Therefore, in the following, the impact of the video content characterization parameters (SI, TI, motion, and source rate) on the perceived quality is analyzed.

Fig. 8 MOS score of the reference videos at a confidence interval of 95 %

From Fig. 9a, b, c, and d we can notice that the MOS scores are not linearly related to SI, TI, motion, and source rate and, consequently, that the perceived quality is not linearly related to these parameters. However, the perceived quality changes significantly for different video contents. For this reason, in the following section the impact of the video content on the perceived quality for different values of the impairments is studied.

Fig. 9 MOS scores with 95 % confidence interval for the videos with different values of SI, TI, motion, and source data rate (where, according to the source data rate, the videos have been grouped into Low Rate (LR) and High Rate (HR) categories)

5.3 Impact of the video content on perceived quality for different values of the impairments

The impact of the SI, TI, motion, and data rate on perceived quality for different values of the transmission artifacts, delay, jitter, PLR, and bandwidth will be further analyzed in the following sections.

5.3.1 Impact of SI

Figure 9a shows the MOS scores with deviation on the scores at 95 % of confidence interval for the SRCs with their corresponding SI. It can be noticed that the MOS score does not have any clear increasing/decreasing pattern for the video characterized by high SI. From this fact, we can conclude that the perceived quality is not linearly related to the SI of the SRCs. However, the videos with different SI have significantly different MOS scores. Therefore, in the following the impacts of SI on perceived quality for different values of the impairments (delay, jitter, PLR, and bandwidth) will be presented.

Delay

Figure 10a shows the MOS scores of the videos with different spatial information for different values of the delay. It can be noticed that the MOS does not change significantly even for high values of the delay. For all the considered delay values, the videos follow a pattern similar to that of the reference video sequences (Fig. 9a); in other words, the MOS shows no clear trend with increasing SI at any delay value. We can therefore conclude that the impact of SI on the perceived quality is not influenced by the considered values of the delay artifact.

Fig. 10. Impact of SI on perceived quality for different values of delay and jitter impairments

Jitter

The impact of different values of the jitter artifact on the perceived quality of the videos with different values of SI is plotted in Fig. 10b. The plot shows that the MOS degrades significantly for high values of jitter (> 2 ms) for all the considered videos. For a jitter value of 1 ms, the MOS is high and differs significantly among the videos; however, there is no clear increasing or decreasing pattern with SI, and the pattern is similar to that of the reference video sequences (Fig. 9a). For videos impaired by high values of jitter (> 2 ms), the MOS scores are at their minimum and increase slightly with SI. The results show that, for low values of jitter, the impact of SI on perceived quality is not evident. However, when the channel is affected by high values of jitter, the perceived quality increases slightly for spatially complex scenes (i.e., high SI values), although the increment is very small and not linear.

PLR

From Fig. 11a, it can be noticed that, as for the jitter artifact, for low values of PLR (0.1 % and 0.4 %) the MOS scores are high and follow a pattern similar to that of the reference videos (Fig. 9a). For high values of PLR, the MOS scores are low and do not show any clear pattern with increasing SI. The results show that the impact of SI does not change for low values of PLR, whereas for large values of PLR the impairment strongly dominates the impact of SI.

Fig. 11. Impact of SI on perceived quality for different values of PLR and bandwidth limitation

Bandwidth limitation

Figure 11b shows that, for low values of bandwidth (< 1 Mbps), the MOS scores slightly increase for the videos with high SI. However, when the bandwidth exceeds 2 Mbps, the trend is inverted and the MOS decreases for videos with high SI. Moreover, for high values of bandwidth the MOS has a pattern similar to that of the reference videos (Fig. 9a). Based on these results, we can conclude that, for low values of bandwidth, the perceived quality is poor and slightly increases for videos with high SI. For channels characterized by high bandwidth values, instead, the impact of SI on perceived quality does not change significantly.

From the results and discussion presented in Section 5.3.1, we can conclude that the perceived quality is significantly influenced by jitter, PLR, and bandwidth for all the videos, regardless of their SI. The impact of SI on the perceived quality is not significantly modified for all the considered values of delay, for high values of bandwidth, and for low values of jitter and PLR. For high values of jitter and low values of bandwidth, the perceived quality slightly improves for the videos with high SI.

5.3.2 Impact of TI

Figure 9b shows the MOS scores of the different videos plotted against their corresponding TI. From the figure, it can be noticed that the MOS is not linearly related to TI: there is no clear increasing or decreasing pattern in the perceived quality as TI increases. However, the perceived quality differs significantly among videos with different TI. Therefore, in the following subsections, the impact of TI on perceived quality for different values of the impairments is presented.

Delay

The impact of the video TI for different values of the delay artifact is plotted in Fig. 12a. The plot shows that the MOS scores do not have any clear increasing or decreasing pattern with increasing TI for any of the considered values of the delay artifact (Table 2). Although different delay values result in slightly different MOS scores, the overall MOS pattern is similar to that of the reference video sequences (Fig. 9b). We can therefore conclude that the presence of delay does not modify the impact of TI on perceived quality.

Fig. 12. Impact of TI on perceived quality for different values of delay and jitter

Jitter

Figure 12b shows that, for low values of jitter (< 2 ms), the MOS scores are significantly different for the different videos. There is no clear increasing or decreasing pattern of the MOS with increasing TI, but the pattern is similar to that of the reference videos (Fig. 9b). If the channel is affected by high values of jitter (> 3 ms), the MOS scores are at their minimum and do not vary significantly across videos with different TI. From this analysis, it can be concluded that the impact of TI on perceived quality does not change for low values of jitter, and that for high jitter values the impact of TI is almost negligible.

PLR

Figure 13a plots the MOS scores of the videos with different TI for different values of the PLR artifact. For low values of PLR (< 1 %), the MOS scores change significantly across videos with different TI values; however, there is no clear increasing or decreasing pattern relating TI and MOS for the considered PLR values. For high values of PLR (> 3 %), the MOS scores are very small, independently of the TI of the video. This means that, once the effect of PLR becomes dominant, the change in MOS is no longer strongly correlated with TI. In other words, the perceived quality differs significantly among videos with diverse TI, especially for low values of PLR, but there is no direct relationship between perceived quality and TI; for high values of PLR, the perceived quality is at its minimum regardless of the TI of the videos.

Fig. 13. Impact of TI on perceived quality for different values of PLR and bandwidth limitation

Bandwidth limitation

From Fig. 13b, it can be noticed that the MOS increases for high values of the bandwidth for all the videos, which span a wide range of TI. Moreover, for high values of the bandwidth the MOS scores have a pattern similar to that of the reference videos (Fig. 9b). Conversely, for low values of bandwidth the MOS is small for all the videos, whatever their TI. This result implies that the impact of TI on perceived quality is not altered for high values of bandwidth, while it becomes minimal for low values of bandwidth.

From the results of Section 5.3.2, it is clear that the perceived quality is not linearly related to the TI of the videos for any of the considered transmission impairments. The quality is significantly influenced by jitter, PLR, and bandwidth, but not by delay, independently of the temporal information of the videos. Furthermore, the impact of TI is not affected by low values of PLR and jitter, by any of the considered values of delay, or by high values of bandwidth. For high values of jitter and PLR, and for low values of bandwidth, the perceived quality is at its minimum and independent of the TI of the video.

5.3.3 Impact of video motion

Video motion has been characterized by using the global motion coefficient. Figure 9c shows the MOS scores, with their corresponding 95 % confidence intervals, for videos with diverse motion coefficients. The figure shows no evident relationship between the MOS and increasing motion. We can therefore conclude that the perceived quality is not directly or linearly related to video motion. Nevertheless, videos with different motion coefficients have significantly different MOS scores. Therefore, in the following, the impact of video motion on perceived quality for different values of the transmission impairments is presented.

Delay

The impact of the video motion, expressed through the motion coefficient, on perceived quality for different values of the delay impairment is plotted in Fig. 14a. The plot shows that all the considered videos have high MOS scores for the considered values of the delay artifact. Moreover, the MOS scores of the videos (with different motion coefficients) follow a pattern similar to that of the reference videos (Fig. 9c) for all the considered delay values. The MOS scores do differ across delay values, but they show no clear relationship with the delay values for the videos with high motion coefficients. The results show that the impact of the video motion on perceived quality is not significantly influenced by the considered values of the delay artifact.

Fig. 14. Impact of motion on perceived quality for different values of delay and jitter

Jitter

The impact of the video motion on perceived quality for different values of the jitter artifact is plotted in Fig. 14b. The figure shows that, for low values of jitter (< 2 ms), videos with different motion coefficients have significantly different MOS scores; however, there is no clear increasing or decreasing pattern of the MOS with the motion coefficient. On the other hand, for high values of jitter (> 2 ms), the videos have nearly the same MOS, which is at its minimum, independently of the motion coefficient of the video. For further analysis, the source videos have been grouped into low motion (motion coefficient < 3) and high motion (motion coefficient > 3) categories based on the global motion coefficient, and the impact of the jitter artifact has then been analyzed per category. The results, shown in Fig. 15a, let us conclude that the increments or decrements of the MOS are not strictly related to the motion (high or low) of the videos for any of the considered levels of the jitter impairment. The results show that high values of jitter have a significant destructive impact on perceived quality, whereas the impact of the video motion on perceived quality is not altered for low values of jitter. Moreover, the impact of jitter does not depend on the video motion.
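The grouping into low and high motion categories can be sketched as follows. This is an illustrative example only: the records are hypothetical, the helper names are introduced here, and a coefficient exactly equal to 3 is assigned to the high motion category, which is an assumption since the text does not specify this case.

```python
from collections import defaultdict
from statistics import mean

MOTION_THRESHOLD = 3.0  # split point between low and high motion


def motion_class(coefficient, threshold=MOTION_THRESHOLD):
    """Label a video as low or high motion.

    A coefficient equal to the threshold is treated as high motion
    (an assumption made for this sketch).
    """
    return "low" if coefficient < threshold else "high"


def mean_mos_by_class_and_jitter(records):
    """Average the MOS per (motion class, jitter value) pair.

    records: iterable of (motion_coefficient, jitter_ms, mos) tuples.
    """
    buckets = defaultdict(list)
    for coefficient, jitter_ms, mos in records:
        buckets[(motion_class(coefficient), jitter_ms)].append(mos)
    return {key: mean(values) for key, values in buckets.items()}


# Hypothetical (motion coefficient, jitter in ms, MOS) records.
records = [
    (1.5, 1, 4.2), (2.4, 1, 3.8), (4.1, 1, 4.0), (5.3, 1, 3.6),
    (1.5, 3, 1.4), (2.4, 3, 1.2), (4.1, 3, 1.3), (5.3, 3, 1.5),
]

table = mean_mos_by_class_and_jitter(records)
for (cls, jitter_ms), value in sorted(table.items()):
    print(f"{cls:>4} motion, jitter {jitter_ms} ms: mean MOS = {value:.2f}")
```

Comparing the two categories at each jitter level makes it easy to see whether the degradation caused by jitter differs between low and high motion content.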

Fig. 15. a Impact of jitter values for low motion and high motion videos, b Impact of motion on perceived quality for different values of jitter and PLR

PLR

Figure 15b shows the impact of the video motion coefficient on perceived quality for different values of the PLR artifact. The plot shows that, for high PLR values, the videos have low MOS scores independently of their motion. From these results it can be concluded that, for high values of PLR, the perceived quality is significantly influenced by the PLR, independently of the motion of the video. For very small amounts of packet loss (< 1 %), the videos with different motion have high MOS scores, and the scores are significantly different; however, there is no clear increasing or decreasing pattern of the MOS with increasing motion coefficient.

Bandwidth limitation

Figure 16a shows that, for high values of bandwidth (> 2 Mbps), the considered videos have significantly different MOS scores. At low values of bandwidth, the MOS is very small and does not follow a specific trend for videos with high motion. From these results, it can be concluded that the impact of video motion on perceived quality is not significantly influenced by the channel capacity, especially for high values (> 2 Mbps).

Fig. 16. a Impact of motion on perceived quality for different values of bandwidth limitation, b MOS scores for the videos with different source data rates for different values of bandwidth limitation

From the results presented in Section 5.3.3, we can conclude that the video motion does not have a direct or linear relationship with perceived quality. Moreover, for low values of jitter and PLR, all values of delay and high values of bandwidth, the videos with different motion have significantly different perceived quality. For high values of PLR and jitter and low values of bandwidth, the perceived quality is poor and does not depend significantly on the video motion.

5.4 Impact of source rate

To analyze the impact of the source data rate on perceived video quality, the eight SRCs have been categorized into three groups: low data rate (less than 9 Mbps), medium data rate, and high data rate (more than 14 Mbps). From Fig. 16b it can be noticed that the MOS increases with every increase in bandwidth until the channel bandwidth exceeds 2 Mbps, after which the MOS does not change significantly. This could be due to the fact that the test videos have been created by streaming the MPEG2-encoded SRCs through a noisy channel. Moreover, the MOS scores are not significantly different across the data rate groups of the SRCs. In other words, the quality does not have a linear or direct relationship with the source rate, even when the videos are streamed over channels with different bandwidths. From these facts, we can conclude that the perceived quality is not influenced by the source data rate, even when channels with different bandwidth limitations are considered.
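The three-way categorization by source data rate can be expressed as a small helper. The sketch below assumes that rates between 9 and 14 Mbps (bounds included) fall into the medium group; the video names and rates are hypothetical.

```python
def rate_category(rate_mbps):
    """Map a source data rate in Mbps to low, medium, or high.

    Boundaries follow the thresholds stated in the text; treating exactly
    9 and 14 Mbps as medium is an assumption made for this sketch.
    """
    if rate_mbps < 9:
        return "low"
    if rate_mbps > 14:
        return "high"
    return "medium"


# Hypothetical source data rates (Mbps) for a few SRCs.
source_rates = {"src_1": 6.5, "src_2": 11.0, "src_3": 16.2, "src_4": 8.9}

groups = {}
for name, rate in source_rates.items():
    groups.setdefault(rate_category(rate), []).append(name)

for category in ("low", "medium", "high"):
    print(category, groups.get(category, []))
```

Once the SRCs are grouped, the MOS of each group can be compared at each bandwidth value, as done in Fig. 16b.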

5.5 Remarks

This section has examined how the different impairment levels and the video content related parameters jointly affect the perceived video quality. In more detail, in the presence of low levels of impairment (all the considered delay values, high values of bandwidth, and low values of jitter and PLR), the impact of the content related parameters SI, TI, and motion on the perceived quality is not significantly different from the trend observed in the absence of impairments. For high levels of impairment (high values of jitter and low values of bandwidth), the perceived video quality is poor, independently of the values of the content related parameters.

These results can be used in the design of QoE-based scheduling algorithms. If the transmission channel is affected by high levels of transmission impairments, the optimization algorithm should focus on improving the network state, since the perceived quality is poor regardless of the content. When the channel is not significantly affected by transmission impairments, instead, whether the QoE can be improved depends on the particular content. The results of our experiments can be used as a guide for the optimization process.
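As an illustration of how such a guideline could enter a scheduler, the following sketch selects the focus of the optimization from the measured channel state. It is a hypothetical decision rule, not an algorithm proposed in this work: the class and function names are introduced here, and the default thresholds (2 ms of jitter, 3 % of PLR, 1 Mbps of bandwidth) are assumptions loosely taken from the ranges discussed in Section 5.3 that would need tuning for a real system. Delay is omitted because it did not show a significant impact on perceived quality.

```python
from dataclasses import dataclass


@dataclass
class ChannelState:
    """Measured channel conditions (hypothetical container for this sketch)."""
    jitter_ms: float
    plr_percent: float
    bandwidth_mbps: float


def optimization_focus(state, max_jitter_ms=2.0, max_plr_percent=3.0,
                       min_bandwidth_mbps=1.0):
    """Return "network" when the channel is heavily impaired, else "content".

    The thresholds are illustrative assumptions, not values prescribed
    by the experiments.
    """
    heavily_impaired = (
        state.jitter_ms > max_jitter_ms
        or state.plr_percent > max_plr_percent
        or state.bandwidth_mbps < min_bandwidth_mbps
    )
    return "network" if heavily_impaired else "content"


good_channel = ChannelState(jitter_ms=1.0, plr_percent=0.1, bandwidth_mbps=5.0)
bad_channel = ChannelState(jitter_ms=4.0, plr_percent=0.1, bandwidth_mbps=5.0)

print(optimization_focus(good_channel))
print(optimization_focus(bad_channel))
```

With a heavily impaired channel the rule directs the effort to the network state, while with a lightly impaired channel content-aware adaptation becomes the relevant lever.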

6 Conclusions

In this article, the impact of video content on perceived quality for different key transmission impairments has been presented. To this end, the ReTRiEVED video quality test database has been used. From the performed analysis, the following concluding remarks can be drawn:

  i) jitter, PLR and bandwidth limitation have a significant impact on the quality, whereas the delay does not show a significant impact on the perceived quality. When the impact of the impairments on quality is low, the quality is mainly influenced by the video content;

  ii) the video content has a significant impact on the perceived quality. However, the content related parameters usually employed for characterizing video content (spatial-temporal perceptual information, video motion, and source rate) do not show an evident relationship with perceived video quality.

In more detail:

  • the impact of the SI on the perceived quality is not significant for all the considered values of delay, high values of bandwidth and low values of jitter and PLR. For high values of jitter and low values of bandwidth, the perceived quality slightly improves for the videos with high SI;

  • the impact of TI and motion on perceived quality is not affected by low values of PLR and jitter, all considered values of delay, and high values of bandwidth. For high values of jitter and PLR, and low values of bandwidth, the perceived quality is at its minimum and independent of the TI and motion of the video;

  • the perceived quality is not influenced by the data rate, even when the videos (MPEG2 encoded) have been streamed through channels with different bandwidth limitations.

The results of this paper can be exploited for the design and optimization of robust multimedia communication networks, services, and applications, for the design of video QoE metrics, and for the development of resource-scheduling strategies.

As future work, the impact of the impairments and video content on the perceived quality will be further investigated through a new subjective experiment with a larger number of reference videos and new video encoding standards (MPEG4, H.264/AVC, and HEVC).