Abstract
Frame interpolation, a popular and challenging task in video research, aims to increase the frame rate of a video. Most existing methods employ a fixed motion model, e.g., linear, quadratic, or cubic, to estimate the intermediate warping field. However, such fixed motion models cannot adequately represent the complicated non-linear motions found in the real world or in rendered animations. Instead, we present an adaptive flow prediction module to better approximate the complex motions in video. Furthermore, interpolating just one intermediate frame between consecutive input frames may be insufficient for complicated non-linear motions. To enable multi-frame interpolation, we introduce time as a control variable when interpolating frames between the original ones in our generic adaptive flow prediction module. Qualitative and quantitative experimental results show that our method produces high-quality results and outperforms existing state-of-the-art methods on popular public datasets.
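To make concrete the fixed motion models the abstract contrasts with the proposed adaptive module, the sketch below illustrates how a linear model and a quadratic (constant-acceleration) model predict the flow from frame 0 to an intermediate time t from the flows between original frames. This is a background illustration, not the paper's implementation; the function names and the NumPy formulation are our own, with the quadratic form following the standard quadratic-video-interpolation derivation.

```python
import numpy as np

def linear_intermediate_flow(flow_0_1, t):
    """Linear motion model: displacement scales linearly with time.

    flow_0_1: optical flow from frame 0 to frame 1, shape (H, W, 2).
    t: intermediate time in (0, 1).
    """
    return t * flow_0_1

def quadratic_intermediate_flow(flow_0_1, flow_0_m1, t):
    """Quadratic (constant-acceleration) motion model.

    Given flows from frame 0 to frames 1 and -1, the displacement to
    time t is:
        f_{0->t} = 0.5 * (f_{0->1} + f_{0->-1}) * t^2
                 + 0.5 * (f_{0->1} - f_{0->-1}) * t
    """
    accel = 0.5 * (flow_0_1 + flow_0_m1)   # acceleration term
    veloc = 0.5 * (flow_0_1 - flow_0_m1)   # velocity term
    return accel * t ** 2 + veloc * t
```

When the motion happens to be constant-velocity (flow_0_m1 == -flow_0_1), the acceleration term vanishes and the quadratic model reduces exactly to the linear one; any motion whose trajectory is not a low-order polynomial, however, is mispredicted by both, which is the gap an adaptive flow predictor targets.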
Acknowledgements
This project was supported by the Research Grants Council of the Hong Kong Special Administrative Region, under RGC General Research Fund (Project No. CUHK 14201017), Shenzhen Science and Technology Program (No. JCYJ20180507182410327), and the Science and Technology Plan Project of Guangzhou (No. 201704020141).
Author information
Jinbo Xing received his B.Sc. degree in computer science from the Chinese University of Hong Kong in 2020. He is currently an M.Sc. student in the Department of Computer Science and Engineering, the Chinese University of Hong Kong. His research interests include computer vision and computer graphics.
Wenbo Hu is currently a Ph.D. student in the Department of Computer Science and Engineering, the Chinese University of Hong Kong. He received his B.Sc. degree in computer science and technology from Dalian University of Technology, China, in 2018. His research interests include computer vision, computer graphics, and deep learning.
Yuechen Zhang is currently a final-year undergraduate student majoring in computer science at the Chinese University of Hong Kong. His research interests include semantic segmentation, video frame interpolation, and neural style transfer.
Tien-Tsin Wong received his B.Sc., M.Phil., and Ph.D. degrees in computer science from the Chinese University of Hong Kong in 1992, 1994, and 1998, respectively, where he is currently a professor in the Department of Computer Science and Engineering. His main research interests include computer graphics, computational manga, precomputed lighting, image-based rendering, GPU techniques, medical visualization, multimedia compression, and computer vision. He received the IEEE Transactions on Multimedia Prize Paper Award in 2005 and a Young Researcher Award in 2004.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Cite this article
Xing, J., Hu, W., Zhang, Y. et al. Flow-aware synthesis: A generic motion model for video frame interpolation. Comp. Visual Media 7, 393–405 (2021). https://doi.org/10.1007/s41095-021-0208-x