Improving the 3D Perception of the Pepper Robot Using Depth Prediction from Monocular Frames

Bauer, Zuria; Escalona, Felix; Cruz, Edmanuel; Cazorla, Miguel; Gomez-Donoso, Francisco

doi:10.1007/978-3-319-99885-5_10

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 855)

Included in the following conference series:

Workshop of Physical Agents

Abstract

The Pepper robot produces poor depth estimates. In this paper, we present a method for improving its 3D estimation. The method uses the RGB image to predict depth from a single monocular frame. As will be shown, combining the predicted monocular depth with the robot's 3D depth data yields better 3D data.

Notes

1. http://doc.aldebaran.com/2-7/family/pepper_technical/video_3D_pep.html

Acknowledgments

This work has been supported by the Spanish Government grant TIN2016-76515R, with FEDER funds. Edmanuel Cruz is funded by the Panamanian grant for PhD studies IFARHU & SENACYT 270-2016-207. This work has also been supported by the Spanish grants for PhD studies ACIF/2017/243 and FPU16/00887. Thanks also to Nvidia for the generous donation of a Titan Xp and a Quadro P6000.

Author information

Authors and Affiliations

Institute for Computer Research, University of Alicante, P.O. Box 99, 03080, Alicante, Spain
Zuria Bauer, Felix Escalona, Edmanuel Cruz, Miguel Cazorla & Francisco Gomez-Donoso

Corresponding author

Correspondence to Felix Escalona.

Editor information

Editors and Affiliations

Computer Science Department, Universidad Carlos III de Madrid, Leganés, Madrid, Spain
Raquel Fuentetaja Pizán, Ángel García Olaya, Maria Paz Sesmero Lorente, Jose Antonio Iglesias Martínez & Agapito Ledezma Espino

Cite this paper

Bauer, Z., Escalona, F., Cruz, E., Cazorla, M., Gomez-Donoso, F. (2019). Improving the 3D Perception of the Pepper Robot Using Depth Prediction from Monocular Frames. In: Fuentetaja Pizán, R., García Olaya, Á., Sesmero Lorente, M., Iglesias Martínez, J., Ledezma Espino, A. (eds) Advances in Physical Agents. WAF 2018. Advances in Intelligent Systems and Computing, vol 855. Springer, Cham. https://doi.org/10.1007/978-3-319-99885-5_10

DOI: https://doi.org/10.1007/978-3-319-99885-5_10
Published: 21 November 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-99884-8
Online ISBN: 978-3-319-99885-5
eBook Packages: Intelligent Technologies and Robotics (R0)
