Abstract
Event-based image representations are fundamentally different from traditional dense images. This makes it challenging to apply current state-of-the-art object detection models, as they are designed for dense images. In this work, we evaluate the YOLO object detection model on event data. To this end, we replace the dense convolution layers with either sparse convolutions or asynchronous sparse convolutions, which enables direct processing of event-based images, and compare performance and runtime against feeding event histograms into dense convolutions. Hyper-parameters are shared across all variants to isolate the effect that the sparse representation has on detection performance. We show that current sparse-convolution implementations cannot translate their theoretically lower computational requirements into improved runtime.
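To make the dense baseline mentioned above concrete, the following is a minimal sketch of how a batch of events might be accumulated into an event histogram, and how sparse such a histogram typically is. It is an illustration only, not the authors' code: the function names `event_histogram` and `active_sites`, the two-channel per-polarity layout, and the encoding of polarity as 0 or 1 are all assumptions.

```python
import numpy as np


def event_histogram(xs, ys, ps, height, width):
    """Count events per pixel, one channel per polarity (assumed 0 or 1)."""
    hist = np.zeros((2, height, width), dtype=np.float32)
    # np.add.at accumulates correctly when the same pixel fires repeatedly;
    # plain fancy-index assignment (hist[ps, ys, xs] += 1) would count it once.
    np.add.at(hist, (ps, ys, xs), 1.0)
    return hist


def active_sites(hist):
    """Boolean mask of pixels that received at least one event."""
    return hist.sum(axis=0) > 0


# Toy input: four events on a 4x5 sensor (hypothetical values).
xs = np.array([1, 1, 3, 0])
ys = np.array([0, 0, 2, 1])
ps = np.array([1, 1, 0, 0])

hist = event_histogram(xs, ys, ps, height=4, width=5)
mask = active_sites(hist)
density = mask.mean()  # fraction of pixels a sparse convolution would visit
```

A dense convolution processes every pixel of `hist` regardless of `density`, whereas a sparse convolution only needs to compute outputs at the sites marked in `mask`. That gap is the theoretical saving the abstract refers to.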
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Mechler, V., Rojtberg, P. (2023). Transferring Dense Object Detection Models To Event-Based Data. In: Nakamatsu, K., Patnaik, S., Kountchev, R., Li, R., Aharari, A. (eds) Advanced Intelligent Virtual Reality Technologies. Smart Innovation, Systems and Technologies, vol 330. Springer, Singapore. https://doi.org/10.1007/978-981-19-7742-8_3
DOI: https://doi.org/10.1007/978-981-19-7742-8_3
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-7741-1
Online ISBN: 978-981-19-7742-8
eBook Packages: Intelligent Technologies and Robotics (R0)