Abstract
Deep learning networks remain limited in their ability to detect small objects, which poses challenges for practical object detection tasks. In this research, we propose a multi-scale small object detection model. The model incorporates a multi-level bidirectional pyramid structure that integrates deep and shallow networks, preserving fine local details while strengthening global features. A dedicated multi-scale detection head is also integrated into the model to capture information crucial to small objects. In comprehensive experiments, the proposed model achieves a mean average precision (mAP) that surpasses that of the well-established you only look once version 7 (YOLOv7) model by 1.1%. These findings indicate that the model improves performance in both conventional and small object detection scenarios.
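The idea of bidirectional fusion between shallow and deep feature levels can be illustrated with a minimal NumPy sketch. This is not the authors' architecture: the function names (`upsample2x`, `downsample2x`, `bidirectional_fuse`), the three-level layout, the plain addition used for fusion, and the feature map sizes are all assumptions made for illustration. A real detector would use learned convolutions and possibly weighted fusion at each step.

```python
import numpy as np


def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)


def downsample2x(x):
    """2x2 max pooling of a (C, H, W) feature map (H and W must be even)."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))


def bidirectional_fuse(shallow, middle, deep):
    """Hypothetical three-level bidirectional fusion using plain addition."""
    # Top-down pass: push deep, more global features toward shallow levels.
    td_middle = middle + upsample2x(deep)
    td_shallow = shallow + upsample2x(td_middle)
    # Bottom-up pass: push fine local detail back toward deeper levels.
    out_shallow = td_shallow
    out_middle = td_middle + downsample2x(out_shallow)
    out_deep = deep + downsample2x(out_middle)
    return out_shallow, out_middle, out_deep


# Assumed toy feature maps: 4 channels at three spatial resolutions.
rng = np.random.default_rng(0)
shallow = rng.standard_normal((4, 16, 16))
middle = rng.standard_normal((4, 8, 8))
deep = rng.standard_normal((4, 4, 4))

outs = bidirectional_fuse(shallow, middle, deep)
shapes = [o.shape for o in outs]
print(shapes)
```

Each output level keeps its original resolution, so a separate detection head could in principle be attached to every scale, with the highest-resolution level being the one most relevant to small objects.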
Ethics declarations
Conflicts of interest
The authors declare no conflict of interest.
Additional information
This work has been partially supported by the National Natural Science Foundation of China (No. 62103298) and the South African National Research Foundation (Nos. 132797 and 137951).
Cite this article
Zhu, Y., Yang, S., Tong, J. et al. Multi-scale detector optimized for small target. Optoelectron. Lett. 20, 243–248 (2024). https://doi.org/10.1007/s11801-024-3126-1
DOI: https://doi.org/10.1007/s11801-024-3126-1