Multi-network for Joint Detection of Dynamic and Static Objects in a Road Scene Captured by an RGB Camera

  • Conference paper
Inventive Communication and Computational Technologies

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 383)

Abstract

This study presents a unified network that simultaneously localizes dynamic traffic objects, static traffic objects, and pedestrians, classifies traffic light colors, and detects the drivable area and lane lines. In the network architecture, a traffic object branch classifies dynamic objects such as cars, trucks, buses, motorcycles, and bicycles. Static objects are categorized as traffic signs and traffic lights, and pedestrians are localized as a separate traffic object group. The color of a traffic light is classified whenever the light is visible. The network has a unified design: one shared encoder for feature extraction and three decoders, one per task. The BDD100K dataset is used for benchmarking. Against publicly available multi-task networks, the presented model ranks second in drivable area segmentation, lane line detection, and inference speed. Compared with state-of-the-art segmentation models re-trained on the BDD100K dataset, the dynamic object localization task reaches an mIoU of 73.54%, which is 40% higher than the results of the re-trained segmentation methods.
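
A minimal sketch of the shared-encoder, three-decoder layout described in the abstract, written in PyTorch. The layer counts, channel widths, class counts, and head names below are illustrative assumptions; the paper's actual backbone, decoder designs, and loss functions are not reproduced here.

```python
import torch
import torch.nn as nn

class MultiTaskNet(nn.Module):
    """Shared encoder feeding three task-specific decoders.

    All sizes are illustrative assumptions, not the paper's exact
    configuration.
    """

    def __init__(self, num_object_classes: int = 8):
        super().__init__()
        # Shared encoder: computes one feature map reused by every task head.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(128, 256, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )

        def make_decoder(out_channels: int) -> nn.Sequential:
            # Upsamples the shared features back to input resolution.
            return nn.Sequential(
                nn.ConvTranspose2d(256, 128, 4, stride=2, padding=1), nn.ReLU(inplace=True),
                nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),
                nn.ConvTranspose2d(64, out_channels, 4, stride=2, padding=1),
            )

        # One decoder per task: traffic objects (dynamic, static, pedestrians),
        # drivable area, and lane lines.
        self.object_head = make_decoder(num_object_classes)
        self.drivable_head = make_decoder(2)  # drivable area vs. background
        self.lane_head = make_decoder(2)      # lane line vs. background

    def forward(self, x: torch.Tensor) -> dict:
        features = self.encoder(x)  # extracted once, shared by all heads
        return {
            "objects": self.object_head(features),
            "drivable": self.drivable_head(features),
            "lanes": self.lane_head(features),
        }

# Example: one 640x384 RGB frame through the network.
net = MultiTaskNet()
out = net(torch.randn(1, 3, 384, 640))
print({k: tuple(v.shape) for k, v in out.items()})
```

Sharing the encoder means the expensive feature extraction runs once per frame rather than three times, which is what lets a joint network stay competitive in inference speed across all three tasks.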

Acknowledgements

This work has been supported by the Scientific Research Projects Commission of Galatasaray University under grant #19.401.005.

Author information

Corresponding author

Correspondence to Bekir Eren Çaldıran.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Çaldıran, B.E., Acarman, T. (2023). Multi-network for Joint Detection of Dynamic and Static Objects in a Road Scene Captured by an RGB Camera. In: Ranganathan, G., Fernando, X., Rocha, Á. (eds) Inventive Communication and Computational Technologies. Lecture Notes in Networks and Systems, vol 383. Springer, Singapore. https://doi.org/10.1007/978-981-19-4960-9_63
