1 Introduction

Cameras have become affordable thanks to technological advancements, which has led to their widespread use for capturing precious moments. Omni-directional cameras capture the whole scene using more than one camera, and the images they capture are stitched together to give a 360\(^\circ \) view of the scene.

Figure 1 portrays the basic workflow of 360\(^\circ \) video. It generally commences with an omni-directional camera capturing 360\(^\circ \) frames. These are stitched together and sent to the encoding phase, where the spherical video is projected onto a 2D plane, followed by frame packing and compression. The two commonly used projection formats, Equirectangular Projection (ERP) and Cubemap Projection (CMP), are shown for a user-generated 360\(^\circ \) video in Figs. 2 and 3, respectively. The encoding phase is followed by the decoding phase, where the video undergoes interactive projection for rendering in concert with the respective input/output technology (such as a Head-Mounted Display, HMD) at the consumer end.

Figure 4 depicts different Fields of View (FoVs) in traditional viewing mode extracted from the equirectangular projection given in Fig. 2. This gives content creators the flexibility to shoot in 360\(^\circ \) and later, in post-processing, select the FoV that matters the most.

This review article

  • is, to the best of our knowledge, the first review on user-generated 360\(^\circ \) video.

  • introduces the various research areas in user-generated 360\(^\circ \) video.

  • investigates recent literature and categorizes it by research area.

  • highlights the pros and cons of each methodology.

Fig. 1

360\(^\circ \) video processing workflow [1]

Fig. 2

Equirectangular projection

Fig. 3

Cubemap projection

Fig. 4

a–f Different FoVs from user-generated 360\(^\circ \) video

Fig. 5

Areas of research in 360\(^\circ \) video

The article is organized as follows. Section 2 surveys research trends in 360\(^\circ \) video production, communication, and analysis. The processing techniques applied to 360\(^\circ \) videos are discussed in Sect. 2.1. Section 2.2 discusses streaming techniques. Video post-production methodologies are discussed in Sect. 2.3. The evaluation of 360\(^\circ \) video quality is reviewed in Sect. 2.4. Observations are listed in Sect. 3, and Sect. 4 concludes this article.

2 Research Trends in 360\(^\circ \) Video

This section briefly surveys each research area in 360\(^\circ \) video. Figure 5 depicts the research trends in 360\(^\circ \) video.

2.1 Processing of 360\(^\circ \) Video

This section discusses the processing techniques that 360\(^\circ \) videos require before transmission or storage. After capture, the frames of a 360\(^\circ \) video need to be stitched and projected into a suitable representation, and the result is then compressed for transmission or storage. The following subsections review the existing methods for processing 360\(^\circ \) video.

2.1.1 Projection

In Sphere Segmented Projection, visual artifacts are caused by the inactive regions [2]. To enhance coding efficiency and minimize these artifacts, Yoon et al. suggest a scheme that pads the inactive regions. For panoramic videos, Huang et al. presented a low-complexity scheme and video stitching mechanism [3]. Hanhart et al. recommended a coding approach based on spherical neighboring relationships and projection format adaptation [4]. Su and Grauman proposed a spherical convolutional network that processes 360\(^\circ \) imagery directly in its equirectangular projection and is translated from a planar Convolutional Neural Network (CNN) [5]. Lin et al. propose a hybrid equiangular cubemap projection that minimizes seam artifacts [6]. Wang et al. experimented with characteristic equirectangular projection formats of sequences in the clip [7].
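To make the projection formats concrete, the following minimal Python sketch (our illustration; the function names and axis convention are assumptions, not taken from the cited works) maps a unit viewing direction to ERP pixel coordinates and back:

```python
import numpy as np

def sphere_to_erp(v, width, height):
    """Map a unit direction vector (x, y, z) to ERP pixel coordinates.
    Convention (one of several in use): longitude measured around the
    y (up) axis, latitude positive toward +y."""
    x, y, z = v
    theta = np.arctan2(x, z)                 # longitude in [-pi, pi]
    phi = np.arcsin(np.clip(y, -1.0, 1.0))   # latitude in [-pi/2, pi/2]
    u = (theta / (2 * np.pi) + 0.5) * width
    return u, (0.5 - phi / np.pi) * height

def erp_to_sphere(u, v, width, height):
    """Inverse mapping: ERP pixel back to a unit direction vector."""
    theta = (u / width - 0.5) * 2 * np.pi
    phi = (0.5 - v / height) * np.pi
    return np.array([np.cos(phi) * np.sin(theta),
                     np.sin(phi),
                     np.cos(phi) * np.cos(theta)])

# The image center maps to the forward viewing direction (0, 0, 1):
print(sphere_to_erp((0.0, 0.0, 1.0), 3840, 1920))   # -> (1920.0, 960.0)
```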

Conventional motion models are ill-suited to spherical content, making it difficult to attain efficient compression for storage and transmission [8]. Hence, Vishwanath et al. recommended a rotational model to capture angular motion on the sphere effectively. In 3D space, a vector A is rotated by an angle \(\alpha \) around an axis given by a unit vector B. The coordinates of vectors A and B are (p, q, r) and (l, m, n), respectively. The coordinates of the rotated vector \(A^{'}\) are

$$\begin{aligned} p^{'}=l(B\cdot A)(1-\cos \alpha )+p\cos \alpha +(-nq+mr)\sin \alpha \end{aligned}$$
(1)
$$\begin{aligned} q^{'}=m(B\cdot A)(1-\cos \alpha )+q\cos \alpha +(np-lr)\sin \alpha \end{aligned}$$
(2)
$$\begin{aligned} r^{'}=n(B\cdot A)(1-\cos \alpha )+r\cos \alpha +(-mp+lq)\sin \alpha \end{aligned}$$
(3)

where \(B\cdot A\) is the dot product. The rotation axis B is the vector perpendicular to the plane defined by the origin, the vector A, and the rotated vector \(A^{'}\). Vector B is computed as follows:

$$\begin{aligned} B=\frac{A\times A^{'}}{| A\times A^{'}|} \end{aligned}$$
(4)

Angle of rotation is given as

$$\begin{aligned} \alpha =\cos ^{-1}(A\cdot A^{'}) \end{aligned}$$
(5)
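
Equations (1)–(3) are the component form of Rodrigues' rotation formula. The following Python sketch (our illustration, not the authors' implementation) evaluates the rotation and recovers the axis and angle per Eqs. (4)–(5), assuming unit-length vectors:

```python
import numpy as np

def rotate(A, B, alpha):
    """Rotate vector A about unit axis B by angle alpha (radians);
    the vector form of Eqs. (1)-(3)."""
    A, B = np.asarray(A, float), np.asarray(B, float)
    return (B * np.dot(B, A) * (1 - np.cos(alpha))
            + A * np.cos(alpha)
            + np.cross(B, A) * np.sin(alpha))

def axis_and_angle(A, A_rot):
    """Recover the rotation axis and angle from unit vectors A and A',
    per Eqs. (4)-(5)."""
    B = np.cross(A, A_rot)
    B = B / np.linalg.norm(B)
    alpha = np.arccos(np.clip(np.dot(A, A_rot), -1.0, 1.0))
    return B, alpha

A = np.array([1.0, 0.0, 0.0])
A_rot = rotate(A, [0.0, 0.0, 1.0], np.pi / 4)   # 45 degrees about z
print(axis_and_angle(A, A_rot))                 # axis ~ (0, 0, 1), angle ~ pi/4
```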

The summary of techniques, highlights, and challenges of 360\(^\circ \) video projections is listed in Table 1.

Table 1 Summary on projection of 360\(^\circ \) video

2.1.2 Distortion

Azevedo et al. provide an extensive analysis of the most common visual distortions that alter 360\(^\circ \) video signals in immersive applications [1]. Aksu et al. present scalable multicast live delivery of 360\(^\circ \) video with distortion analysis [9]. Yoon et al. recommend padding the inactive regions to lessen distortions [2]. A detailed review of the distortions in 360\(^\circ \) video is given in Table 2.

Table 2 Summary on distortion of 360\(^\circ \) video

2.1.3 Compression

Le et al. designed a transcoding system that uses the ARIA block cipher for encryption [10]. To attain more uniform spherical sampling, Lin et al. offer 360\(^\circ \)-specific coding tools [6]. Their mapping function is given as follows:

2D (Cube-Map) to Sphere:

$$\begin{aligned} h_{b}(a,b)=\frac{b}{1+0.4(1-a^{2})(1-b^{2})} \end{aligned}$$
(6)

Sphere to 2D (Cube-Map):

$$\begin{aligned} k_{b}(a,b)= \left\{ \begin{array}{cl} b, &{}\text {if } s=0 \\ \frac{1-\sqrt{1-4s(b-s)}}{2s},&{} \text {otherwise}\end{array}\right. \end{aligned}$$
(7)
$$\begin{aligned} {\text {where }}s=0.4b(a^{2}-1). \end{aligned}$$
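
A small numerical sketch of Eqs. (6)–(7) follows (our illustration; function and variable names are ours). It verifies that the inverse mapping recovers the cube-face coordinate:

```python
import numpy as np

def cube_to_sphere(a, b):
    """Forward mapping of Eq. (6) for face coordinates a, b in [-1, 1]."""
    return b / (1 + 0.4 * (1 - a**2) * (1 - b**2))

def sphere_to_cube(a, b):
    """Inverse mapping of Eq. (7)."""
    s = 0.4 * b * (a**2 - 1)
    if s == 0:
        return b
    return (1 - np.sqrt(1 - 4 * s * (b - s))) / (2 * s)

a, c = 0.0, 0.5
b = cube_to_sphere(a, c)      # ~0.3846
print(sphere_to_cube(a, b))   # recovers 0.5
```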

To improve perception-based video compression and storage, Mazumdar et al. suggested an efficient compression mechanism called Vignette [11]. Xiu et al. reported considerable coding-efficiency gains for paired video categories such as HDR and SDR [12]. Targeting the spherical domain, Wang et al. propose a motion estimation and compensation algorithm based on a spherical motion model [7].

To enhance efficiency and minimize encoding time, Zhang et al. present a compression optimization procedure [13]. Choi et al. offer an inventive video compression approach for high-quality video services, together with HDR video coding schemes [14]. Lin et al. propose a massive-scale subject-labeled database comprising compressed H.265/HEVC videos with miscellaneous Perceivable Encoding Artifacts (PEAs) [15]. For enhanced performance, Le et al. designed a transcoding system that plays a vital role in modifying the bit rates and changing the resolution of 360\(^\circ \) videos [10]. Various 360\(^\circ \) video compression techniques, highlights, and challenges are summarized in Table 3.

Table 3 Summary on compression of 360\(^\circ \) video

2.2 Streaming of 360\(^\circ \) Video

This section presents the mechanisms required for 360\(^\circ \) video streaming. Streaming can be based on the FoV or on tiles. The following subsections describe the techniques involved in FoV-based and tile-based streaming.

2.2.1 FoV-Based Streaming

Duanmu et al. established a two-tier framework to improve bandwidth utilization for 360\(^\circ \) video streaming [16]. Skupin et al. propose an optimal way of streaming based on the FoV [17]. Sun et al. propose a two-tier solution that delivers the entire 360\(^\circ \) span of the video in a low-quality base tier and the viewport in a higher-quality enhancement tier [18]. Jiang et al. recommended Plato, a viewport-adaptive streaming scheme using reinforcement learning [19].
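
The two-tier idea can be sketched as a simple budget split: buy the best affordable full-sphere base tier, then spend the remainder on the predicted viewport. The following Python sketch is our simplified illustration of the concept in [16, 18]; the bitrate ladders are hypothetical:

```python
def allocate_two_tier(budget_kbps, base_ladder, enh_ladder):
    """Pick the highest affordable full-sphere base-tier bitrate, then
    spend the remaining budget on the viewport enhancement tier."""
    base = max((r for r in base_ladder if r <= budget_kbps),
               default=min(base_ladder))
    enh = max((r for r in enh_ladder if r <= budget_kbps - base), default=0)
    return base, enh

# Hypothetical bitrate ladders (kbps) and an 8 Mbps budget:
print(allocate_two_tier(8000, [1000, 2000, 4000], [2000, 4000, 8000]))
# -> (4000, 4000): low-quality sphere plus high-quality viewport
```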

Qian et al. introduce a cellular-friendly streaming methodology that conveys only the 360\(^\circ \) video viewport, based on the prediction of head movements [20]. 360\(^\circ \) video streams have greater bandwidth requirements and need quicker responsiveness to viewers' inputs [21]; in this respect, Zhou et al. analyze Oculus 360\(^\circ \) video streaming. To capture the long-term dependency and nonlinear relation between past and future viewpoints, Yang et al. presented a single-viewpoint prediction model built on a CNN [22]. Corbillon et al. give a viewport-adaptive approach that allows the streamed video to have a lower bit rate in comparison with the original video [23]. Table 4 reviews FoV-based streaming of 360\(^\circ \) video.

Table 4 Summary on FoV-based streaming of 360\(^\circ \) video

2.2.2 Tile-Based Streaming

Sanchez et al. illustrate streaming by means of the tiling tactics followed in the Moving Picture Experts Group (MPEG) OMAF specification [24]. Xie et al. presented a probabilistic tile-based adaptive streaming model referred to as 360ProbDASH [25]. Graf et al. propose adaptive tile-based streaming over HTTP to solve the problems faced by video delivery infrastructures [26].

As the complexity of 360\(^\circ \) video increases, bitrate adaptation for a varying network becomes essential [27]; hence, Le Feuvre and Concolato used the MPEG DASH (Dynamic Adaptive Streaming over HTTP) standard to describe how spatial access can be attained. Kammachi-Sreedhar and Curcio described an optimized streaming technology [28].

Nguyen et al. suggest a flexible method for tile-based viewport streaming [29]. Due to network latency, 360\(^\circ \) video streaming is a difficult task [30]; hence, Mahzari et al. recommended a tile-based caching policy. In real-life cellular networks, tiled video is a probable solution for aggressively minimizing the bandwidth essential for 360\(^\circ \) video transmission [31]; accordingly, Lo et al. report the performance of tile-based streaming over a cellular network. High-quality streaming is limited by power consumption and bandwidth effectiveness [32]; hence, Son et al. offer a tile-based streaming approach. A summary of tile-based streaming of 360\(^\circ \) video is shown in Table 5.
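
Common to these tile-based schemes is a selection step that marks which tiles to fetch in high quality for a predicted viewport. The sketch below is our coarse approximation (the tile grid, FoV, and margin are assumptions), using the angular distance between the view direction and each ERP tile center:

```python
import numpy as np

def select_tiles(yaw_deg, pitch_deg, fov_deg=100.0, cols=8, rows=4):
    """Mark ERP tiles whose centers lie within half the FoV (plus half a
    tile width as margin) of the predicted view direction."""
    def unit(yaw, pitch):
        return np.array([np.cos(pitch) * np.sin(yaw),
                         np.sin(pitch),
                         np.cos(pitch) * np.cos(yaw)])
    view = unit(np.radians(yaw_deg), np.radians(pitch_deg))
    mask = np.zeros((rows, cols), dtype=bool)
    for r in range(rows):
        for c in range(cols):
            t_yaw = ((c + 0.5) / cols - 0.5) * 2 * np.pi    # tile-center longitude
            t_pitch = (0.5 - (r + 0.5) / rows) * np.pi      # tile-center latitude
            ang = np.degrees(np.arccos(np.clip(view @ unit(t_yaw, t_pitch), -1, 1)))
            mask[r, c] = ang <= fov_deg / 2 + 180.0 / cols  # margin: half a tile
    return mask

# Tiles to request in high quality for a viewer looking straight ahead:
print(select_tiles(0.0, 0.0).astype(int))
```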

Table 5 Summary on tile-based streaming of 360\(^\circ \) video

2.3 Post-production of 360\(^\circ \) Video

At the user end, the stored or streamed content is post-processed. Post-production eases comprehension and provides seamless visualization and a better user experience. Several methods for post-production are discussed in the following subsections.

2.3.1 Visualization

In live broadcasting, the broadcaster may not be aware of the users' FoV [33]. In this respect, Takada et al. propose a visualization method based on users' Points of View (PoV) that uses a spherical heat map, allowing the broadcaster to grasp users' FoV easily and exchange information with users smoothly. Azevedo et al. analyze how visual distortions alter 360\(^\circ \) video signals in immersive applications [1]. Existing techniques, highlights, and challenges of visualization in 360\(^\circ \) video are summarized in Table 6.
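
A spherical heat map of the kind used by Takada et al. can be approximated by binning viewing directions on an equirectangular grid. The following is a minimal sketch under that assumption (function name and bin sizes are ours):

```python
import numpy as np

def pov_heatmap(yaws_deg, pitches_deg, cols=72, rows=36):
    """Accumulate viewing directions into an equirectangular heat map
    (one bin per 5 degrees), normalized to a fraction of samples."""
    heat = np.zeros((rows, cols))
    for yaw, pitch in zip(yaws_deg, pitches_deg):
        c = int((yaw % 360.0) / 360.0 * cols) % cols
        r = int((90.0 - np.clip(pitch, -89.9, 89.9)) / 180.0 * rows)
        heat[r, c] += 1
    return heat / max(len(yaws_deg), 1)

# Three viewers looking roughly at the front of the scene:
print(pov_heatmap([0.0, 5.0, 355.0], [0.0, 2.0, -3.0]).max())
```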

Table 6 Summary on visualization of 360\(^\circ \) video

2.3.2 Viewport Prediction

User head movements drive interaction with and modification of the spatial parts of the video, allowing users to view only the essential portions of the video at a given time [9]. To exploit this, Aksu et al. offered a novel adaptable framework for viewport prediction. Heyse et al. offered a contextual-bandit approach based on reinforcement learning [34]. Jiang et al. proposed viewport-adaptive streaming in which the tiles that map to the field of view are provided at high resolution [19]. Hu et al. recommended "deep 360\(^\circ \) pilot", an agent-based deep learning mechanism that pilots viewers through 360\(^\circ \) sports videos automatically, and developed an agent-specific domain with a clear definition of the objects in the video [35].

To analyze visual quality at the viewport under end-to-end delay, Sanchez et al. proposed a viewport-dependent scheme with a gain of 46% compared with a viewport-independent scheme [24]. Foreseeing the future PoV over a long time horizon can save bandwidth for on-demand streaming, in which video stalling is diminished despite noteworthy bandwidth variations in the network [36]. To support this, Li et al. introduced two cluster-based PoV prediction models. Table 7 summarizes viewport prediction for 360\(^\circ \) video.
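
A common baseline against which learned predictors such as [22, 36] are compared is linear extrapolation of recent head orientation. Below is a minimal sketch for the yaw component (our illustration, not the cited models), with yaw unwrapped to handle the \(\pm 180^\circ \) seam:

```python
import numpy as np

def predict_yaw(times, yaws_deg, horizon):
    """Linear-regression extrapolation of yaw: fit a line to recent
    samples and evaluate it `horizon` seconds past the last one."""
    yaws = np.unwrap(np.radians(yaws_deg))          # remove +/-180 deg jumps
    slope, intercept = np.polyfit(times, yaws, 1)
    pred = slope * (times[-1] + horizon) + intercept
    return (np.degrees(pred) + 180.0) % 360.0 - 180.0

# Head panning right at roughly 20 deg/s; predict 1 s ahead:
t = np.array([0.0, 0.2, 0.4, 0.6, 0.8, 1.0])
print(predict_yaw(t, 20.0 * t, horizon=1.0))        # ~40 degrees
```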

Table 7 Summary on viewport prediction of 360\(^\circ \) video

2.3.3 Designing Interface

Pavel et al. presented a technique based on interactive shot orientation, enabling users to view all the significant content in the film [37]. Poblete et al. proposed a scalable crowdsourced design approach [38]. Tang and Fakourfar supported collaborative viewing and interaction through gaze awareness and gesture techniques for 360\(^\circ \) videos [39]. Interface design for 360\(^\circ \) video is reviewed in Table 8.

Table 8 Summary on designing interface of 360\(^\circ \) video

2.3.4 User Experience

Broeck et al. proposed numerous interaction methodologies [40]. One challenge of watching 360\(^\circ \) videos is endlessly focusing and refocusing on intended targets [41]. To overcome this, Lin et al. studied two focus-guidance approaches: Automatic Piloting (directly taking audiences to the target) and Visual Supervision (indicating the direction of the target). Nasrabadi et al. proposed a taxonomy of 360\(^\circ \) videos, classifying them based on camera and object motion [42]. Existing 360\(^\circ \) video user-experience techniques, highlights, and challenges are reviewed in Table 9.

Table 9 Summary on user experience of 360\(^\circ \) video

2.3.5 Cybersickness

Bala et al. presented an experimental study comparing and combining numerous available methodologies for minimizing cybersickness in 360\(^\circ \) video [43]. Cybersickness in 360\(^\circ \) video is summarized in Table 10.

Table 10 Summary on cybersickness of 360\(^\circ \) video

2.3.6 Summarization

For long 360\(^\circ \) videos, Sung et al. addressed the issue of story-based temporal summarization [44], proposing an innovative memory-network-based model (Past-Future Memory Network). Available techniques, highlights, and challenges of 360\(^\circ \) video summarization are listed in Table 11.

Table 11 Summary on summarization of 360\(^\circ \) video

2.3.7 Subtitle

Brown et al. specify four subtitle behaviors (120-degree, static-follow, lag-follow, appear) and carry out user testing of the 360\(^\circ \) video experience [45]. A detailed review of subtitles for 360\(^\circ \) video is given in Table 12.
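
Of these behaviors, lag-follow can be sketched as exponential smoothing of the subtitle's yaw toward the viewer's head yaw. The following is our illustration of the behavior named in [45]; the smoothing factor is an assumption:

```python
def lag_follow(head_yaws_deg, smoothing=0.15):
    """Sketch of a lag-follow subtitle: each frame, the subtitle yaw
    moves a fraction of the remaining angular distance toward the head
    yaw, so it trails head motion instead of being rigidly locked."""
    sub_yaw, path = head_yaws_deg[0], []
    for head in head_yaws_deg:
        delta = (head - sub_yaw + 180.0) % 360.0 - 180.0   # shortest arc
        sub_yaw = (sub_yaw + smoothing * delta) % 360.0
        path.append(sub_yaw)
    return path

# The viewer snaps 90 degrees right; the subtitle catches up gradually:
print([round(y, 1) for y in lag_follow([0.0] * 3 + [90.0] * 5)])
```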

Table 12 Summary on subtitle of 360\(^\circ \) video

2.4 Quality Evaluation of 360\(^\circ \) Video

This section reviews the literature on assessing the quality of user-generated 360\(^\circ \) videos. Existing works are discussed in the following subsections.

2.4.1 Standardization

Wien et al. addressed the current status of standardization, focusing on the scientific aspects associated with immersive video [46]. Hannuksela et al. give an outline of the first edition of the OMAF standard [47]. Skupin et al. presented the up-to-date status of ongoing standardization efforts [17]. Azevedo et al. offered some standardization techniques [1]. Domanski et al. discussed different kinds of highly immersive visual media [48]. Table 13 describes 360\(^\circ \) video standardization techniques, highlights, and challenges.

Table 13 Summary on standardization of 360\(^\circ \) video

2.4.2 Stabilization

Kopf offers a hybrid 2D-3D procedure for 360\(^\circ \) video stabilization by means of a deformable rotational motion model [49]. Tang et al. introduce an approach for joint stabilization and direction of 360\(^\circ \) videos [50], which includes a carefully designed new motion estimation technique for 360\(^\circ \) videos. Stabilization of 360\(^\circ \) video is summarized in Table 14.
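
The rotation-only core of such stabilizers can be sketched as follows: low-pass filter the measured orientation path and apply, per frame, the rotation that maps the raw path onto the smoothed one. This is our simplified illustration (it assumes per-frame orientations are already estimated), not the cited pipelines:

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

def stabilizing_rotations(euler_deg, window=9):
    """Per-frame corrective rotations: low-pass filter the camera's
    orientation path and return (smoothed * raw^-1) for each frame."""
    angles = np.unwrap(np.radians(euler_deg), axis=0)    # avoid angle wrap
    kernel = np.ones(window) / window
    smooth = np.column_stack([np.convolve(a, kernel, mode='same')
                              for a in angles.T])
    raw = R.from_euler('yxz', angles)
    target = R.from_euler('yxz', smooth)
    return target * raw.inv()   # maps the shaky path onto the smooth one

# A synthetic shaky orientation path (yaw, pitch, roll per frame, degrees):
shaky = np.cumsum(np.random.randn(100, 3), axis=0) * 0.2
print(stabilizing_rotations(shaky)[0].as_euler('yxz', degrees=True))
```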

Table 14 Summary on stabilization of 360\(^\circ \) video

2.4.3 Assessment

Huang et al. support video quality evaluation and propose a latitude-based visual attention model for 360-degree videos [51]. Hanhart et al. describe the quality evaluation scheme adopted by the JVET of ITU-T VCEG and ISO/IEC MPEG [52]. Zakharchenko et al. discussed the immersive media delivery format and quality assessment process [53]. Tran et al. investigated both subjective and objective quality benchmarks for 360\(^\circ \) videos [54]. For 360\(^\circ \) video communication, Tran et al. help identify suitable objective quality benchmarks [55]. Xie et al. presented a QoE-based optimization framework [25]. Jiang et al. suggested Plato, which outperforms existing strategies on numerous QoE metrics [19]. Corbillon et al. recommended an interactive high-quality mechanism for QoE measurement with Head-Mounted Display audiences under minimal supervision [23]. Table 15 covers quality assessment of 360\(^\circ \) video.
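
One widely used objective metric in the JVET evaluation scheme is WS-PSNR, which weights each ERP pixel by the cosine of its latitude so that oversampled polar rows do not dominate the error. A minimal single-channel Python sketch (our illustration):

```python
import numpy as np

def ws_psnr(ref, dist, max_val=255.0):
    """WS-PSNR for single-channel ERP frames: squared error weighted by
    cos(latitude), per the weighting used for ERP."""
    h, w = ref.shape
    lat = (np.arange(h) + 0.5 - h / 2) * np.pi / h          # row latitude
    weight = np.repeat(np.cos(lat)[:, None], w, axis=1)
    err = (ref.astype(float) - dist.astype(float)) ** 2
    wmse = np.sum(weight * err) / np.sum(weight)
    return 10 * np.log10(max_val ** 2 / wmse)

ref = np.random.randint(0, 256, (960, 1920))
dist = np.clip(ref + np.random.randint(-3, 4, ref.shape), 0, 255)
print(round(ws_psnr(ref, dist), 2))    # high value for mild distortion
```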

Table 15 Summary on assessment of 360\(^\circ \) video

3 Observations

The following are the observations made through this study:

  • 360\(^\circ \) video is gaining interest among consumers due to its simplicity.

  • During projection, visual artifacts (also termed distortions) may occur. Hence, extra care must be taken during projection so that the contents of the clip can be used effectively.

  • Once projected, 360\(^\circ \) videos undergo coding for efficient storage and transmission, in which they are compressed while preserving video quality.

  • As the field of view widens, streaming becomes more difficult. Hence, 360\(^\circ \) video must be streamed efficiently with good visual quality.

  • The highly immersive nature of the video should not lead to motion sickness.

  • 360\(^\circ \) video can be delivered to the user optimally by summarizing all the significant information available in the clip.

  • For a clear understanding of the information in a 360\(^\circ \) clip, the video can be streamed with closed captions (i.e., in text form).

  • To maximize the smoothness of visual quality, the video can be stabilized.

  • At the user end, the quality of the video can be checked using quality metrics.

4 Conclusion

360\(^\circ \) video can offer an immersive experience for users. As the FoV of 360\(^\circ \) video is larger than that of standard videos, it encompasses a huge amount of information. Due to the high resolution, 360\(^\circ \) video processing, transmission, and display have to be done efficiently. This article presented the various techniques, highlights, and challenges involved in processing, transmitting, and displaying 360\(^\circ \) video. At the viewer end, the decoded video has to be checked for standardization, stabilization, and quality of experience, to ensure high standards, increased immersion, and improved QoE, respectively. The techniques involved in standardization, stabilization, and QoE measurement are listed in this survey with their highlights and challenges. The overarching challenges in 360\(^\circ \) video are achieving high compression rates and improving quality-driven viewport prediction.

As for future trends, 360\(^\circ \) video is growing at a fast pace, and the technology will soon experience a huge leap. The major role of 360\(^\circ \) video is storytelling within an immersive environment. Further cost reductions may be possible in the coming years, bringing the immersive experience to more users. Rapid improvement in 360\(^\circ \) technology and inexpensive equipment will make 360\(^\circ \) video spread swiftly across many industries in the near future. In the upcoming years, 360\(^\circ \) technology is expected to provide high-end video capture with High Dynamic Range (HDR).