Abstract
Uncertain environments, especially uneven lighting and shadows, can seriously degrade an image, which greatly harms object detection. Moreover, unbalanced training samples can cause overfitting: since data collected at night is far rarer than data collected in the daytime, nighttime detection performance is relatively poor. In this paper, we propose a novel data augmentation method named Mask Augmentation, which reduces the brightness and contrast of objects and weakens their edges to simulate degraded scenes. In addition, we propose a new architecture that adds a classification loss branch and a feature extraction module named the Multi-Feature Attention Module, which combines the attention mechanism with feature fusion on the basis of Darknet-53. This architecture makes the features extracted from daytime and nighttime images distinguishable. We also increase the loss weight of nighttime images during training. Our method achieves 78.68% mAP on nighttime detection and 73.14% mAP on daytime detection. Compared with other models, it greatly improves nighttime accuracy and also performs satisfactorily in the daytime. We deployed our model on an intelligent garbage collection robot for real-time detection, which picks up litter automatically at night and assists cleaning staff during the day.
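The abstract's two training-side ideas can be illustrated in code. The sketch below is not the authors' exact procedure (the paper's method section defines Mask Augmentation precisely); it is a minimal interpretation in which an object's bounding-box region is dimmed, its contrast reduced, and its edges softened with a small blur, plus a per-sample loss weight for nighttime images. The parameter names (`brightness`, `contrast`, `edge_sigma`-style blur, `night_weight`) and default values are assumptions for illustration.

```python
import numpy as np

def _box_blur3(a):
    """3x3 mean filter via shifted slices; softens edges inside the ROI."""
    p = np.pad(a, ((1, 1), (1, 1), (0, 0)), mode="edge")
    h, w = a.shape[:2]
    return sum(p[i:i + h, j:j + w]
               for i in (0, 1, 2) for j in (0, 1, 2)) / 9.0

def mask_augment(image, box, brightness=0.6, contrast=0.6):
    """Degrade the region inside `box` = (x1, y1, x2, y2) to mimic a
    shadowed/nighttime scene: lower contrast around the ROI mean,
    scale down brightness, then blur to weaken object edges."""
    out = image.astype(np.float32).copy()
    x1, y1, x2, y2 = box
    roi = out[y1:y2, x1:x2]
    mean = roi.mean(axis=(0, 1), keepdims=True)
    roi = (roi - mean) * contrast + mean   # reduce contrast
    roi = roi * brightness                 # reduce brightness
    roi = _box_blur3(roi)                  # weaken edges
    out[y1:y2, x1:x2] = roi
    return np.clip(out, 0, 255).astype(np.uint8)

def weighted_loss(loss, is_night, night_weight=2.0):
    """Scale the detection loss for nighttime samples so the rarer
    night data contributes more to the gradient."""
    return loss * (night_weight if is_night else 1.0)
```

On a uniform 200-valued patch, `mask_augment` leaves pixels outside the box untouched and maps the boxed region to 200 × 0.6 = 120, so only the simulated degradation is applied; in a real pipeline the augmented images would be mixed into the training set alongside the originals.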
Additional information
This work has been supported by the National Key Research and Development Program of China (No.2018YFB1305200), and the Science Technology Department of Zhejiang Province (No.LGG19F020010).
Cite this article
Liu, S., Shen, J. & Huang, S. Object detection in seriously degraded images with unbalanced training samples. Optoelectron. Lett. 17, 564–571 (2021). https://doi.org/10.1007/s11801-021-1005-6