Image resizing by reconstruction from deep features

Danon, Dov; Arar, Moab; Cohen-Or, Daniel; Shamir, Ariel

doi:10.1007/s41095-021-0216-x

Research Article
Open access
Published: 27 April 2021

Volume 7, pages 453–466, (2021)
Computational Visual Media

Dov Danon¹, Moab Arar¹, Daniel Cohen-Or¹ & Ariel Shamir²

Abstract

Traditional image resizing methods usually work in pixel space and use various saliency measures. The challenge is to adjust the image shape while trying to preserve important content. In this paper we perform image resizing in feature space, using the deep layers of a neural network, which contain rich semantic information. We directly adjust the image feature maps, extracted from a pre-trained classification network, and reconstruct the resized image using neural-network-based optimization. This novel approach leverages the hierarchical encoding of the network and, in particular, the high-level discriminative power of its deeper layers, which can recognize semantic regions and objects, thereby allowing their aspect ratios to be maintained. Reconstructing from deep features results in less noticeable artifacts than using image-space resizing operators. We evaluate our method on benchmarks, compare it to alternative approaches, and demonstrate its strengths on challenging images.
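
The idea of editing feature maps and then recovering an image by optimization can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a small bank of random linear 3×3 filters in place of a pre-trained classification network, and it shrinks the feature maps by dropping the columns with the lowest total activation, a stand-in for the paper's feature-space operator, which the abstract does not specify. All function names here are hypothetical.

```python
import random

def extract_features(img, filters):
    """Correlate img (a list of rows) with each k x k filter ('valid' mode)."""
    k = len(filters[0])
    h, w = len(img) - k + 1, len(img[0]) - k + 1
    return [[[sum(f[a][b] * img[i + a][j + b] for a in range(k) for b in range(k))
              for j in range(w)] for i in range(h)] for f in filters]

def drop_weak_columns(feats, n_keep):
    """Keep the n_keep feature columns with the largest total activation, in order."""
    w = len(feats[0][0])
    energy = [sum(abs(row[j]) for fmap in feats for row in fmap) for j in range(w)]
    keep = sorted(sorted(range(w), key=lambda j: energy[j])[-n_keep:])
    return [[[row[j] for j in keep] for row in fmap] for fmap in feats]

def loss_and_grad(img, filters, target):
    """Return 0.5 * squared feature error and its gradient with respect to img."""
    k = len(filters[0])
    feats = extract_features(img, filters)
    grad = [[0.0] * len(img[0]) for _ in img]
    loss = 0.0
    for f, fmap, tmap in zip(filters, feats, target):
        for i, (frow, trow) in enumerate(zip(fmap, tmap)):
            for j, (fv, tv) in enumerate(zip(frow, trow)):
                r = fv - tv
                loss += 0.5 * r * r
                # Transpose of the correlation: spread the residual over the patch.
                for a in range(k):
                    for b in range(k):
                        grad[i + a][j + b] += r * f[a][b]
    return loss, grad

def resize_by_reconstruction(img, filters, new_width, steps=200):
    """Shrink the feature maps, then fit a narrower image to them by gradient descent."""
    k = len(filters[0])
    target = drop_weak_columns(extract_features(img, filters), new_width - k + 1)
    # Step size 1/L, where L upper-bounds the squared norm of the linear extractor.
    lr = 1.0 / sum(sum(abs(v) for row in f for v in row) ** 2 for f in filters)
    out = [[0.0] * new_width for _ in img]
    history = []
    for _ in range(steps):
        loss, grad = loss_and_grad(out, filters, target)
        history.append(loss)
        out = [[x - lr * g for x, g in zip(xr, gr)] for xr, gr in zip(out, grad)]
    return out, history

# Toy data (assumption): an 8 x 12 random "image" and four random 3 x 3 filters.
rng = random.Random(0)
filters = [[[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)] for _ in range(4)]
image = [[rng.random() for _ in range(12)] for _ in range(8)]
resized, history = resize_by_reconstruction(image, filters, new_width=9)
```

Here `resized` is an 8×9 image whose features approximate the shrunken feature maps, and `history` records the feature-matching loss at each step. A real pipeline along the lines of the abstract would replace the linear filters with the activations of a deep network and would need a more careful optimizer and resizing rule.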

Author information

Dov Danon and Moab Arar contributed equally to this work.

Authors and Affiliations

Tel Aviv University, Tel Aviv, 69978, Israel
Dov Danon, Moab Arar & Daniel Cohen-Or
The Interdisciplinary Center Herzliya, Herzliya, 4610101, Israel
Ariel Shamir

Corresponding author

Correspondence to Dov Danon.

Additional information

Dov Danon is a Ph.D. student at the School of Computer Science, Tel-Aviv University. He received his B.Sc. (summa cum laude) degree in computer science and mathematics from Ben-Gurion University of the Negev in 2007 and his M.Sc. degree in computer science from Tel-Aviv University in 2016. His research interests include machine learning and, in particular, unsupervised learning in image processing.

Moab Arar is a Ph.D. candidate at the School of Computer Science, Tel-Aviv University. He received his B.Sc. degree in computer engineering from the Technion Israel Institute of Technology in 2015, and his M.Sc. degree in computer science from Tel-Aviv University in 2019. His research interests span computer graphics and computer vision, with a particular focus on deep learning and machine learning methodologies for vision and rendering tasks.

Daniel Cohen-Or is a professor at the School of Computer Science, Tel-Aviv University. He received his B.Sc. (cum laude) degree in mathematics and computer science and his M.Sc. (cum laude) degree in computer science, both from Ben-Gurion University, in 1985 and 1986, respectively. He received his Ph.D. from the Department of Computer Science at the State University of New York at Stony Brook in 1991. He received the 2005 Eurographics Outstanding Technical Contributions Award. In 2015, he was named a Thomson Reuters Highly Cited Researcher. Currently, his main interests are in image synthesis, analysis and reconstruction, motion and transformations, shapes and surfaces.

Ariel Shamir is the Dean of the Efi Arazi School of Computer Science at the Interdisciplinary Center in Israel. He received his Ph.D. degree in computer science in 2000 from the Hebrew University in Jerusalem, and spent two years as a postdoctoral researcher in the computational visualisation centre at the University of Texas in Austin. He is currently an associate editor for ACM Transactions on Graphics, Graphical Models, and Computational Visual Media, and was an associate editor for the journal Computers and Graphics (2010–2014) and for IEEE Transactions on Visualization and Computer Graphics (2015–2017). He has also served on the program committees of many leading international conferences, including SIGGRAPH, SIGGRAPH Asia, and Eurographics. Prof. Shamir was named one of the most highly cited researchers on the Thomson Reuters list in 2015. He has broad commercial experience consulting for various companies, including Disney Research, Mitsubishi Electric, PrimeSense (now Apple), and Verisk. He specializes in geometric modeling, computer graphics, image processing, and machine learning.

Electronic supplementary material

Supplementary material: Image resizing by reconstruction from deep features

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.

About this article

Cite this article

Danon, D., Arar, M., Cohen-Or, D. et al. Image resizing by reconstruction from deep features. Comp. Visual Media 7, 453–466 (2021). https://doi.org/10.1007/s41095-021-0216-x

Received: 28 January 2021
Accepted: 25 February 2021
Published: 27 April 2021
Issue Date: December 2021
DOI: https://doi.org/10.1007/s41095-021-0216-x
