Abstract
Traditional image resizing methods usually work in pixel space and rely on various saliency measures. The challenge is to adjust the image shape while preserving important content. In this paper, we perform image resizing in feature space, using the deep layers of a neural network, which contain rich semantic information. We directly adjust the image feature maps, extracted from a pre-trained classification network, and reconstruct the resized image using neural-network-based optimization. This novel approach leverages the hierarchical encoding of the network and, in particular, the high-level discriminative power of its deeper layers, which can recognize semantic regions and objects, thereby allowing their aspect ratios to be maintained. Reconstructing from deep features results in less noticeable artifacts than using image-space resizing operators. We evaluate our method on benchmarks, compare it to alternative approaches, and demonstrate its strengths on challenging images.
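The pipeline outlined in the abstract (extract feature maps, adjust them, reconstruct an image by optimization) can be illustrated with a deliberately small NumPy sketch. It is not the authors' implementation: the pre-trained classification network is replaced by a hypothetical toy extractor (an orthonormal linear filter bank over 2x2 patches), the rule for choosing which feature columns to drop (lowest activation energy) is an assumption made for illustration, and plain gradient descent stands in for whatever optimizer is actually used. All function names are invented for this example.

```python
import numpy as np

PATCH = 2  # each toy feature cell covers a 2x2 pixel patch


def extract_features(image, filters):
    """Toy stand-in for a deep layer: linear filter bank on 2x2 patches."""
    h, w = image.shape
    patches = image.reshape(h // PATCH, PATCH, w // PATCH, PATCH)
    patches = patches.transpose(0, 2, 1, 3)
    patches = patches.reshape(h // PATCH, w // PATCH, PATCH * PATCH)
    return patches @ filters  # shape: (h/2, w/2, channels)


def backproject(feature_grad, filters):
    """Adjoint of extract_features: feature-space gradient -> pixel gradient."""
    fh, fw, _ = feature_grad.shape
    patches = (feature_grad @ filters.T).reshape(fh, fw, PATCH, PATCH)
    return patches.transpose(0, 2, 1, 3).reshape(fh * PATCH, fw * PATCH)


def shrink_feature_map(features, n_remove):
    """Assumed rule: drop the n_remove columns with the lowest energy."""
    energy = (features ** 2).sum(axis=(0, 2))
    keep = np.sort(np.argsort(energy)[n_remove:])
    return features[:, keep, :]


def reconstruct(target, filters, steps=100, lr=0.5):
    """Gradient descent on 0.5 * ||extract_features(x) - target||^2."""
    fh, fw, _ = target.shape
    image = np.zeros((fh * PATCH, fw * PATCH))
    for _ in range(steps):
        residual = extract_features(image, filters) - target
        image -= lr * backproject(residual, filters)
    return image


rng = np.random.default_rng(0)
# Orthonormal rows keep the toy problem well conditioned (an assumption).
filters = np.linalg.qr(rng.standard_normal((8, 4)))[0].T  # shape (4, 8)

image = 0.1 * rng.random((8, 12))  # faint background
image[2:6, 4:8] = 1.0              # bright square "object"

features = extract_features(image, filters)
target = shrink_feature_map(features, n_remove=2)
result = reconstruct(target, filters)
print(image.shape, "->", result.shape)
```

In this toy setting the two lowest-energy feature columns fall in the background, so the bright square keeps its full width while the image becomes narrower. A real deep network is nonlinear and has overlapping receptive fields, so the reconstruction there is a genuinely iterative, non-convex optimization rather than the well-conditioned least-squares problem shown here.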
Author information
Dov Danon is a Ph.D. student at the School of Computer Science, Tel-Aviv University. He received his B.Sc. (summa cum laude) degree in computer science and mathematics from Ben-Gurion University of the Negev in 2007 and his M.Sc. degree in computer science from Tel-Aviv University in 2016. His research interests include machine learning and, in particular, unsupervised learning in image processing.
Moab Arar is a Ph.D. candidate at the School of Computer Science, Tel-Aviv University. He received his B.Sc. degree in computer engineering from the Technion Israel Institute of Technology in 2015, and his M.Sc. degree in computer science from Tel-Aviv University in 2019. His research interests span computer graphics and computer vision, with a particular focus on deep learning and machine learning methodologies for vision and rendering tasks.
Daniel Cohen-Or is a professor at the School of Computer Science, Tel-Aviv University. He received his B.Sc. (cum laude) degree in mathematics and computer science and his M.Sc. (cum laude) degree in computer science, both from Ben-Gurion University, in 1985 and 1986, respectively. He received his Ph.D. from the Department of Computer Science at the State University of New York at Stony Brook in 1991. He received the 2005 Eurographics Outstanding Technical Contributions Award. In 2015, he was named a Thomson Reuters Highly Cited Researcher. Currently, his main interests are in image synthesis, analysis and reconstruction, motion and transformations, shapes and surfaces.
Ariel Shamir is the Dean of the Efi Arazi School of Computer Science at the Interdisciplinary Center in Israel. He received his Ph.D. degree in computer science in 2000 from the Hebrew University in Jerusalem, and spent two years as a postdoctoral researcher in the computational visualisation centre at the University of Texas in Austin. He is currently an associate editor for ACM Transactions on Graphics, Graphical Models, and Computational Visual Media, and was an associate editor for the journal Computers and Graphics (2010–2014) and for IEEE Transactions on Visualization and Computer Graphics (2015–2017). He has also served on the program committees of many leading international conferences, including SIGGRAPH, SIGGRAPH Asia, and Eurographics. Prof. Shamir was named one of the most highly cited researchers on the Thomson Reuters list in 2015. He has broad commercial experience consulting for various companies, including Disney Research, Mitsubishi Electric, PrimeSense (now Apple), and Verisk. He specializes in geometric modeling, computer graphics, image processing, and machine learning.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.
About this article
Cite this article
Danon, D., Arar, M., Cohen-Or, D. et al. Image resizing by reconstruction from deep features. Comp. Visual Media 7, 453–466 (2021). https://doi.org/10.1007/s41095-021-0216-x
DOI: https://doi.org/10.1007/s41095-021-0216-x