Abstract
Frame interpolation, a popular and challenging task in video research, aims to increase the frame rate of a video. Most existing methods employ a fixed motion model, e.g., linear, quadratic, or cubic, to estimate the intermediate warping field. However, such fixed motion models cannot adequately represent the complicated non-linear motions found in the real world or in rendered animations. Instead, we present an adaptive flow prediction module to better approximate the complex motions in video. Furthermore, interpolating just one intermediate frame between consecutive input frames may be insufficient for complicated non-linear motions. To enable multi-frame interpolation, we introduce time as a control variable when interpolating frames between the original ones in our generic adaptive flow prediction module. Qualitative and quantitative experimental results show that our method produces high-quality results and outperforms existing state-of-the-art methods on popular public datasets.
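To make concrete the fixed motion models the abstract contrasts with the proposed adaptive module, the sketch below illustrates how a linear model and a quadratic (constant-acceleration) model predict the flow from frame 0 to an intermediate time t from the flows between original frames. This is a background illustration, not the paper's implementation; the function names and the NumPy formulation are our own, with the quadratic form following the standard quadratic-video-interpolation derivation.

```python
import numpy as np

def linear_intermediate_flow(flow_0_1, t):
    """Linear motion model: displacement scales linearly with time.

    flow_0_1: optical flow from frame 0 to frame 1, shape (H, W, 2).
    t: intermediate time in (0, 1).
    """
    return t * flow_0_1

def quadratic_intermediate_flow(flow_0_1, flow_0_m1, t):
    """Quadratic (constant-acceleration) motion model.

    Given flows from frame 0 to frames 1 and -1, the displacement to
    time t is:
        f_{0->t} = 0.5 * (f_{0->1} + f_{0->-1}) * t^2
                 + 0.5 * (f_{0->1} - f_{0->-1}) * t
    """
    accel = 0.5 * (flow_0_1 + flow_0_m1)   # acceleration term
    veloc = 0.5 * (flow_0_1 - flow_0_m1)   # velocity term
    return accel * t ** 2 + veloc * t
```

When the motion happens to be constant-velocity (flow_0_m1 == -flow_0_1), the acceleration term vanishes and the quadratic model reduces exactly to the linear one; any motion whose trajectory is not a low-order polynomial, however, is mispredicted by both, which is the gap an adaptive flow predictor targets.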
Acknowledgements
This project was supported by the Research Grants Council of the Hong Kong Special Administrative Region, under RGC General Research Fund (Project No. CUHK 14201017), Shenzhen Science and Technology Program (No. JCYJ20180507182410327), and the Science and Technology Plan Project of Guangzhou (No. 201704020141).
Author information
Jinbo Xing received his B.Sc. degree in computer science from the Chinese University of Hong Kong in 2020. He is currently an M.Sc. student in the Department of Computer Science and Engineering, the Chinese University of Hong Kong. His research interests include computer vision and computer graphics.
Wenbo Hu is currently a Ph.D. student in the Department of Computer Science and Engineering, the Chinese University of Hong Kong. He received his B.Sc. degree in computer science and technology from Dalian University of Technology, China, in 2018. His research interests include computer vision, computer graphics, and deep learning.
Yuechen Zhang is currently a final-year undergraduate student majoring in computer science at the Chinese University of Hong Kong. His research interests include semantic segmentation, video frame interpolation, and neural style transfer.
Tien-Tsin Wong received his B.Sc., M.Phil., and Ph.D. degrees in computer science from the Chinese University of Hong Kong in 1992, 1994, and 1998, respectively, where he is currently a professor in the Department of Computer Science and Engineering. His main research interests include computer graphics, computational manga, precomputed lighting, image-based rendering, GPU techniques, medical visualization, multimedia compression, and computer vision. He received the IEEE Transactions on Multimedia Prize Paper Award in 2005 and a Young Researcher Award in 2004.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Cite this article
Xing, J., Hu, W., Zhang, Y. et al. Flow-aware synthesis: A generic motion model for video frame interpolation. Comp. Visual Media 7, 393–405 (2021). https://doi.org/10.1007/s41095-021-0208-x