1 Introduction

Virtual reality (VR) has attracted considerable attention in recent years owing to the rapid development of its industry ecosystem and technology. Omnidirectional images and videos in VR provide an immersive experience in which viewers wearing a head-mounted display (HMD) can freely explore every viewing direction. Since Facebook Surround 360, a high-quality 3D-360 video capture system, became available in 2016, it has been possible to capture and render stereoscopic omnidirectional images. However, owing to the limitations of photographic apparatus, transmission bandwidth, display devices, etc., the overall quality of experience (QoE) of stereoscopic omnidirectional content is far from satisfactory. Note that, in this paper, overall QoE takes both image quality and depth perception into account. Thus, in order to generate high-quality and realistic stereoscopic omnidirectional images, it is desirable and urgent to evaluate the image quality, depth perception, and overall QoE of these images.

Most existing subjective quality assessment databases are based on 2D omnidirectional contents and only image quality is taken into account. An omnidirectional image quality assessment (OIQA) database [3] has been built for subjective quality evaluation study. Xu et al. [11] propose a subjective visual quality assessment method for omnidirectional videos. Zhou et al. [15] explore the impact of spatial resolution on the perceptual quality of immersive 360-degree images. IVQAD 2017 [4] is a database built for immersive video quality assessment.

Unlike for 2D omnidirectional images, the overall QoE of stereoscopic omnidirectional images involves multiple aspects because of the extra dimensionality of immersive content. Specifically, the additional dimension of depth may implicitly affect the viewing experience. Image quality and depth perception are therefore two important factors that should be taken into consideration when evaluating the subjective perception of QoE. However, to the best of our knowledge, there is no available stereoscopic omnidirectional image quality assessment (IQA) database for investigating the properties of these QoE factors.

Fig. 1. Test image contents used in our database.

In this paper, we establish the very first Stereoscopic OmnidirectionaL Image quality assessment Database (SOLID) by evaluating the image quality, depth perception, and overall QoE of stereoscopic omnidirectional images. The relationships among these three factors are analyzed in our database. We find that the subjective rating scores of overall QoE are highly correlated with image quality and moderately correlated with depth perception. In addition, the subjective ratings of image quality are also correlated with the depth perception scores. Finally, several well-known objective IQA metrics are tested on our database. Although these classic metrics achieve promising performance in predicting image quality, predicting the overall QoE of stereoscopic omnidirectional images remains a challenge.

The rest of the paper is organized as follows. Section 2 describes the details of our SOLID database. Section 3 analyzes the subjective rating scores to investigate the relationships among the multidimensional ratings and then evaluates several objective IQA metrics. Section 4 concludes the paper.

2 Stereoscopic Omnidirectional Image Quality Assessment Database

This section introduces the experiment of subjective quality evaluation for stereoscopic omnidirectional images, including the dataset and subjective test methods.

2.1 Image Database

The test images used in our experiment are captured with Facebook Surround 360, an open-source hardware and software system for generating stereoscopic omnidirectional images and videos. It renders stereoscopic omnidirectional images (equirectangular projection) at a resolution of \(8192\times 8192\) in PNG format. In addition, the disparity can be adjusted in the Surround 360 software before rendering, which we use to render images with different depth perception levels in our database.

Fig. 2. Spatial information (SI) distribution for the test images used in our database.

There are 6 high-quality reference stereoscopic omnidirectional images with a resolution of \(8192\times 8192\) in our database, which are shown in Fig. 1. The stereoscopic omnidirectional images are in top-and-bottom format, in which the left and right view images are packed vertically, and there is disparity between the left view and the right view. The database contains both indoor and outdoor scenes. In addition, the spatial information (SI), which represents the spatial complexity of an image, is taken into account when selecting the reference images. Figure 2 shows the SI of the six reference images used in our experiment.
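As an illustration of the SI measure mentioned above, the following is a minimal sketch that assumes the common ITU-T P.910 style definition for a still image: the standard deviation of the Sobel gradient magnitude of the luminance plane. The function names and the nested-list image representation are hypothetical choices for this sketch; the text does not specify how SI was actually computed for Fig. 2, and a vectorized library would be preferable for real \(8192\times 8192\) images.

```python
import math


def sobel_magnitude(luma):
    """Return Sobel gradient magnitudes for the interior pixels of a 2D luminance grid."""
    height, width = len(luma), len(luma[0])
    magnitudes = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Horizontal gradient: right column minus left column, centre row weighted by 2.
            gx = (luma[y - 1][x + 1] + 2 * luma[y][x + 1] + luma[y + 1][x + 1]
                  - luma[y - 1][x - 1] - 2 * luma[y][x - 1] - luma[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row, centre column weighted by 2.
            gy = (luma[y + 1][x - 1] + 2 * luma[y + 1][x] + luma[y + 1][x + 1]
                  - luma[y - 1][x - 1] - 2 * luma[y - 1][x] - luma[y - 1][x + 1])
            magnitudes.append(math.hypot(gx, gy))
    return magnitudes


def spatial_information(luma):
    """Assumed SI definition: population standard deviation of the Sobel magnitudes."""
    mags = sobel_magnitude(luma)
    mean = sum(mags) / len(mags)
    return math.sqrt(sum((m - mean) ** 2 for m in mags) / len(mags))


# Toy inputs (not database images): a flat patch and a patch with a vertical step edge.
flat = [[10] * 5 for _ in range(5)]
step = [[0, 0, 0, 1, 1] for _ in range(5)]
si_flat = spatial_information(flat)
si_step = spatial_information(step)
```

A flat patch has no gradient variation, so its SI is zero, while the step edge produces a positive SI; higher values indicate spatially more complex content.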

Table 1. Distortion level and depth level settings for test images in our database.

To investigate the relationship between image quality, depth perception, and overall QoE, the stimuli are generated to cover a wide range of image quality and depth perception. The distortion levels for each reference stereoscopic omnidirectional image are presented in Table 1. Each hypothetical reference circuit (HRC) denotes a test stereoscopic omnidirectional image with a particular distortion and disparity setting. As shown in Table 1, there are 26 test images for each reference stereoscopic omnidirectional image. Thus, our database contains \(26\times 6 = 156\) test stereoscopic omnidirectional images generated from the 6 reference images.

Inspired by [16], we use 3 depth levels for each reference stereoscopic omnidirectional image in our experiment: (1) zero-disparity images, in which there is no disparity between the left and right view images, (2) medium-disparity images, and (3) large-disparity images.

To simulate quality degradation, each reference image is compressed into the Better Portable Graphics (BPG) format [1] with different quantization parameters (QPs). BPG is a new image format that aims to replace the JPEG image format when quality or file size is an issue. Since stereoscopic omnidirectional images are usually large in file size, we believe the BPG format may become popular for this kind of content, and we therefore choose BPG compression distortion in our experiment. The reference stereoscopic omnidirectional images are distorted either symmetrically or asymmetrically in our database: a distortion is symmetrical if the left and right view images are compressed at the same distortion level and asymmetrical otherwise.

2.2 Subjective Test Methods

In our experiment, image quality, depth perception, and overall QoE are evaluated by the subjects. The experiment follows the Absolute Category Rating with Hidden Reference (ACR-HR) method described in [5]. ACR-HR is a single-stimulus evaluation in which voting is performed after each viewing. Images are assessed on a five-grade scale with the following levels: “5 - Excellent”, “4 - Good”, “3 - Fair”, “2 - Poor”, and “1 - Bad”.

Fig. 3. MOS histogram for image quality, depth perception, and overall QoE in our database.

The following equipment is used in our experiment. The Samsung Gear VR, which is a kind of head-mounted display, is used as the virtual reality display system. The field of view provided by Gear VR is 101 degrees. Samsung Galaxy S9+ is used to display the images with a resolution of \(2560\times 1440\).

Eighteen non-expert student subjects aged from 21 to 28 years take part in our subjective test. All participants pass the visual acuity, color vision (Ishihara charts), and stereo acuity (RANDOT test) tests. Before the formal test, there is a training session in which the purpose of the evaluation is explained to each subject and examples of different levels of compression artifacts and depth perception are shown. In the formal test, the 156 images, including reference and distorted images, are rated in random order. There is no time limit for viewing each stereoscopic omnidirectional image. During the experiment, subjects can take a break whenever they feel tired or uncomfortable, in order to avoid eye fatigue.

3 Analysis of Subjective Database

This section provides a detailed analysis of the subjective rating scores. First, the suitability of the evaluation method is analyzed to ensure that the data collected from the subjects are reliable. Second, we explore the relationship between image quality, depth perception, and overall QoE. Finally, several objective metrics are evaluated on our database.

3.1 Data Analysis

Subjects whose correlation coefficient with the average image quality scores is lower than 0.8, or whose correlation coefficient with the average depth perception scores is lower than 0.65, are considered outliers, and their subjective rating scores are removed from our database. After outlier removal, 16 valid subjects (7 males and 9 females) remain.
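The screening rule above can be sketched as follows. This is a minimal illustration, not the exact procedure used for SOLID: it assumes the Pearson correlation coefficient, assumes that the per-image averages are taken over all subjects (including the one being screened), and uses hypothetical function names and toy ratings.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # A constant rating vector carries no correlation information.
    return cov / (norm_x * norm_y)


def mean_scores(scores):
    """Per-image average over subjects; `scores` maps subject -> list of ratings."""
    return [sum(column) / len(column) for column in zip(*scores.values())]


def valid_subjects(quality, depth, q_thresh=0.8, d_thresh=0.65):
    """Keep subjects that reach both thresholds; all others are treated as outliers."""
    q_mean, d_mean = mean_scores(quality), mean_scores(depth)
    return [s for s in quality
            if pearson(quality[s], q_mean) >= q_thresh
            and pearson(depth[s], d_mean) >= d_thresh]


# Toy ratings for three hypothetical subjects on five images (not SOLID data).
quality = {"s1": [5, 4, 3, 2, 1], "s2": [5, 5, 3, 2, 1], "s3": [1, 2, 3, 4, 5]}
depth = {"s1": [1, 3, 5, 1, 3], "s2": [1, 3, 5, 2, 3], "s3": [1, 3, 4, 1, 3]}
kept = valid_subjects(quality, depth)
```

In this toy example, subject s3 rates image quality in the opposite order to the group average and is therefore screened out, even though the depth ratings of s3 agree with the others.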

Table 2. 95% confidence interval (CI95) of the image quality, depth perception, and overall QoE.
Table 3. Correlation between image quality, depth perception, and overall QoE.

Mean opinion score (MOS) values are computed for each test image in the database by averaging the scores of the valid subjects. Following ITU-T P.910 [5], the 95% confidence intervals (CI) of the subjective rating scores for image quality, depth perception, and overall QoE are given in Table 2. The results show that the subjects reach a reasonable agreement on the perceived image quality, depth perception, and overall QoE. Figure 3 shows the MOS histograms for image quality, depth perception, and overall QoE in our database. In Fig. 3, the MOS values are mainly concentrated between scores 2 and 4.
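For concreteness, the MOS and its 95% confidence interval for a single test image can be sketched as below. The sketch assumes the usual normal approximation, \(\mathrm{CI}_{95} = 1.96\, s/\sqrt{N}\) with \(s\) the sample standard deviation over the \(N\) valid subjects; the function name is hypothetical, and with only 16 subjects a Student-t quantile would give a slightly wider interval.

```python
import math


def mos_and_ci95(ratings):
    """Return (MOS, half-width of the 95% CI) for one image's ratings.

    Assumes the normal approximation with factor 1.96 and the sample
    (n - 1) standard deviation; requires at least two ratings.
    """
    n = len(ratings)
    mos = sum(ratings) / n
    sample_var = sum((r - mos) ** 2 for r in ratings) / (n - 1)
    ci95 = 1.96 * math.sqrt(sample_var / n)
    return mos, ci95


# Toy ratings from 16 hypothetical subjects for one image (not SOLID data).
ratings = [4, 4, 3, 5, 4, 3, 4, 4, 5, 3, 4, 4, 3, 4, 5, 4]
mos, ci95 = mos_and_ci95(ratings)
```

The interval is then reported as MOS ± CI95; a smaller half-width indicates closer agreement among the subjects.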

For each of the 26 HRCs, the MOS values are averaged over the 6 scenes and shown in Fig. 4. Several conclusions can be drawn from Fig. 4. First, BPG compression strongly affects image quality and overall QoE. Second, depth perception is dominated by the disparity, while BPG compression has a moderate influence on it. Third, overall QoE is affected jointly by image quality and depth perception. The overall QoE scores are consistent with the image quality scores, and strong depth perception tends to enhance the overall QoE while poor depth perception reduces it. This can be observed from the MOS values of individual HRCs in Fig. 4, such as HRC 1 and HRC 26. The linear correlation coefficients between image quality, depth perception, and overall QoE are presented in Table 3, which shows that image quality, rather than depth perception, is the dominant factor for overall QoE.

Fig. 4. MOS of image quality, depth perception, and overall QoE for each HRC.

3.2 Key Factors to Image Quality and Depth Perception

The relationship between image quality and distortion levels is shown in Fig. 5. We find that, at the same distortion level, images with large disparity tend to receive higher image quality scores, and this is especially evident in the asymmetrical distortion QP set (Fig. 5 (b)). According to the questionnaire completed by the subjects, stereoscopic omnidirectional images with strong depth perception provide a more realistic environment than those with weak depth perception. As a result, subjects tend to give higher image quality scores to these images.

Fig. 5. The influence of distortion levels on image quality. (a) Image quality of symmetrically compressed images at different depth perception levels; (b) Image quality of asymmetrically compressed images at different depth perception levels.

Fig. 6. Relationship between image quality MOS and depth perception MOS.

Figure 6 is the scatter plot of the subjective rating scores of image quality versus those of depth perception. It can be observed that the subjective image quality scores are highly correlated with the depth perception scores. This is an interesting finding because image quality and depth perception are usually considered two different perceptual dimensions. The data analysis and questionnaire suggest two possible explanations. First, image quality degradation results in blurry objects, which weakens depth perception. Second, subjects tend to give high image quality scores when the depth perception of an image is strong, and vice versa [8]. Thus image quality is highly correlated with depth perception.

3.3 Performance Evaluation of Objective Metrics

We test nine well-known objective image quality assessment metrics on our database: six 2D IQA metrics, namely PSNR, SSIM [10], MS-SSIM [9], FSIM [14], VSI [13], and BRISQUE [6]; two 2D omnidirectional IQA metrics, S-PSNR [12] and WS-PSNR [7]; and a 3D IQA metric [2]. For the 2D metrics, the predicted quality scores of the left and right view images are averaged to give the final quality of a stereoscopic omnidirectional image. The linear correlation coefficient (LCC) and Spearman’s rank-order correlation coefficient (SROCC) are used to measure the performance of the objective metrics. A higher correlation coefficient indicates better agreement with human subjective quality judgement.
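The evaluation protocol described above can be sketched as follows. This is an illustrative sketch under stated assumptions: LCC is taken to be the Pearson coefficient computed directly on the raw predictions (no nonlinear fitting step is assumed), SROCC is the Pearson coefficient of average ranks, and the function names and toy values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient (used here as LCC)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def srocc(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def stereo_score(left_score, right_score):
    """Final 2D-metric score of a stereoscopic image: mean of the two view scores."""
    return (left_score + right_score) / 2


# Toy values (not SOLID results): MOS and a hypothetical metric's per-view scores.
mos = [1.5, 2.0, 3.5, 4.0, 4.5]
left = [19.0, 26.0, 29.0, 41.0, 58.0]
right = [21.0, 24.0, 31.0, 39.0, 62.0]
predicted = [stereo_score(l, r) for l, r in zip(left, right)]
lcc_value = pearson(predicted, mos)
srocc_value = srocc(predicted, mos)
```

Because the toy predictions increase monotonically with MOS but not linearly, SROCC reaches 1 while LCC stays below 1, which illustrates why the two coefficients are reported together.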

Table 4. LCC and SROCC performance of objective image quality metrics on our database.

The performance of the above metrics is shown in Table 4. FSIM achieves the best performance, while the 2D OIQA metrics show no advantage over traditional 2D IQA metrics on omnidirectional images. Existing metrics achieve promising performance for image quality; however, their performance drops significantly when it comes to overall QoE. The results in Table 4 demonstrate that predicting overall QoE is still a challenge, since it depends not only on image quality but also on other factors such as depth perception and visual comfort.

4 Conclusion

In this paper, we build a Stereoscopic OmnidirectionaL Image quality assessment Database named SOLID. SOLID contains 156 test images with different levels of depth and BPG compression artifacts. To the best of our knowledge, it is the very first stereoscopic omnidirectional image quality assessment database that considers image quality, depth perception, and overall QoE. By analyzing the subjective rating scores, we find that image quality, rather than depth perception, is the dominant factor for overall QoE. In addition, image quality is highly correlated with depth perception: images with large disparity tend to achieve higher image quality scores. Finally, several well-known objective IQA metrics are tested on our SOLID database. Experimental results show that, for stereoscopic omnidirectional images, overall QoE assessment is more challenging than IQA.