Introduction

In many respects, technological intervention in human problems has shifted from merely assisting people to being fully relied upon, especially after the evolution of artificial intelligence and deep learning. Object detection is one of the tasks gaining prominence in almost all sectors. Since numerous reviews of the area already exist, we avoid restating the same topics and instead emphasize the least discussed attributes of object detection.

The main motive of our study is to highlight indirect parameters of object detection that also provide significant gains in performance. In addition, we briefly review predominant methods, including those from the pre-deep learning era. Finally, we outline the best-researched applications of object detection across various domains over the decades.

The manuscript is organized as follows. The second section briefly reviews predominant methods, and the third section analyzes indirect parameters of object detection. The fourth section outlines the best applications of object detection, and the last section draws conclusions.

Review on Predominant Methods

Early object detection was carried out using template matching and part-based object representation [16]. The focus was on particular objects whose part layout is roughly rigid (like faces). Recognition was then based on the object's geometric structure until about 1990 [43]. Later, the focus shifted from geometry to statistical classifiers based on feature representation (such as AdaBoost [59], SVM [39] and neural networks). Classifiers built on global handcrafted feature extraction set the stage for subsequent research in the field. Appearance representation later shifted from global [37] to local representation, which is invariant to geometric transformations such as rotation, occlusion, scale, viewpoint and illumination changes. Representative methods include SIFT [36], Haar-like features [59], shape contexts [6], local binary patterns (LBP), histogram of oriented gradients (HOG) [12] and region covariance. After extraction, local features are combined either through straightforward concatenation or through feature-pooling encoders. Through various methods, including bag of visual words [11], spatial pyramid matching and Fisher vectors [42], local handcrafted feature descriptors gained a reputation for their invariance to geometric transformations.

In the deep learning era, the feature descriptor representing an object is learned automatically by a convolutional neural network. The convolutional layers of a CNN are responsible for feature extraction, the extracted features are then combined in the fully connected layers, and finally the classification layer assigns class-specific labels. A CNN extracts features layer by layer: initial layers extract elementary features, while deeper layers combine them into more robust and discriminative features [54, 57].
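To make the layer-by-layer picture concrete, below is a minimal sketch (not any specific cited network) of a CNN whose convolutional layers extract features, whose fully connected layer combines them, and whose final layer assigns class scores; all layer sizes and the class count are arbitrary assumptions.

```python
# Minimal CNN sketch: conv layers extract features, the fully connected
# layer combines them, and the final layer outputs class scores.
# Layer sizes and the class count are arbitrary assumptions.
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(  # feature extraction
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # elementary features
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # more discriminative features
        )
        self.fc = nn.Sequential(        # combine extracted features
            nn.Flatten(), nn.Linear(32 * 8 * 8, 128), nn.ReLU(),
        )
        self.classifier = nn.Linear(128, num_classes)  # class-specific labels

    def forward(self, x):
        return self.classifier(self.fc(self.features(x)))

print(TinyCNN()(torch.randn(1, 3, 32, 32)).shape)  # torch.Size([1, 10])
```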

Object detectors use a CNN as the backbone for detection [7]. Predominant methods in the deep learning era fall into single- and two-stage detectors [22, 26, 34, 45]. A single-stage detector combines class labeling and bounding box regression into a single pipeline without any external or internal object proposal. Commonly, it partitions the input image into a coarse grid, and in each cell objects are classified and bounding boxes are adjusted. Representative methods include DetectorNet, OverFeat [48], YOLO [45] and SSD [35], as shown in Table 1. All these methods are similar in that they resolve class labels in each cell; they differ in how the bounding box regressor is trained simultaneously with class labeling. YOLO and SSD are the two most important single-stage detectors. YOLO [45] assigns a probability to every class in each cell; the class with the highest probability is retained, and bounding boxes are adjusted with respect to the size of the object. Classification and bounding box regression are trained in parallel, end to end. A single-stage detector has the upper hand in real-time applications such as pedestrian detection and other moving-object tasks, since it is faster than a two-stage detector. SSD incorporates the advantages of YOLO and of two-stage detectors to build a detector as fast as YOLO and as accurate as a two-stage detector (Faster RCNN). The SSD [34] architecture uses fixed 1*1 convolutions with stride two throughout the network, so each consecutive layer shrinks the feature map, and SSD attaches a classifier and detector at each layer to detect objects of varying size.

The two-stage detector, on the other hand, adds a preprocessing object proposal step before resolving class labels and regressing bounding boxes. The external region proposal is the computational bottleneck of the two-stage detector. However, these methods are preferred when accuracy matters more than speed, since the object proposal searches the image for cues of an object. It is therefore effective even for small objects (nanoparticles, cells); such applications are discussed in "Applications of object detection". Initially, the object proposal was an external step based on objectness cues (edge, color, texture and gradient), which reduced the search space to a great extent but occupied considerable time. Researchers therefore integrated the object proposal within the DCNN pipeline, which increased performance substantially. Representative two-stage detectors include RCNN, Fast RCNN [23] and Faster RCNN, as shown in Table 1; Faster RCNN integrates the object proposal within the DCNN pipeline.

Table 1 Predominant methods of object detection

To summarize, methods from both the single- and two-stage detector families, namely Faster RCNN, YOLO and SSD, are frequently used in the applications discussed in "Applications of object detection". Faster RCNN is accurate, whereas YOLO is faster; SSD combines aspects of both.
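To make the comparison concrete, below is a minimal inference sketch using torchvision's pretrained Faster RCNN detector (a two-stage method); the image path and score threshold are illustrative assumptions, not choices made by the surveyed works.

```python
# Minimal inference sketch with a pretrained two-stage detector (Faster RCNN)
# from torchvision (>= 0.13 API); "street.jpg" and the 0.5 threshold are hypothetical.
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()  # inference mode

image = to_tensor(Image.open("street.jpg").convert("RGB"))
with torch.no_grad():
    output = model([image])[0]  # dict with 'boxes', 'labels', 'scores'

keep = output["scores"] > 0.5   # keep confident detections only
print(output["boxes"][keep], output["labels"][keep])
```

A single-stage detector such as SSD can be swapped in through the same calling convention (for example, torchvision.models.detection.ssd300_vgg16), trading some accuracy for speed.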

Indirect Parameters of Object Detection

The architecture of an object detector plays a key role in its performance, as discussed in "Review on predominant methods". The two-stage detector adds an object proposal step before classification and regression, whereas the single-stage detector roughly divides the input image into a coarse grid and omits the proposal. These distinct architectures yield different results: a two-stage detector attains better accuracy, while a single-stage detector runs at better speed. As there are numerous reviews of both detection families, our survey avoids restating the same methods. Instead, we focus on parameters other than architectural design that can contribute to the performance of object detectors. The indirect parameters include:

  • Context

  • Object proposal

  • Data augmentation

  • Localization error

  • Training strategy

Context

Context plays a significant role in object recognition, especially when the represented features are insufficient for prediction, i.e., when the detection framework encounters occlusion, small objects or low image quality. Modeling context provides additional clues for prediction. For instance, when detecting objects in a kitchen, the likely objects are a chimney, gas stove, vessels, cooker, etc.

The context broadly falls into two categories: (a) global context and (b) local context.

  a. Global context: It models the entire scene. For detection in office premises, it predicts the presence of cubicles, laptops and computer systems. Contextual details are combined with the regular feature representation for the final prediction [18].

  b. Local context: It represents the relationships between objects. An object's boundary gives additional details about its interaction with other objects. Expanding the object's boundary and exploiting the boundary regions provides supplementary information, such as the object's position (above, below, behind, left or right) relative to other objects, which gives an additional clue for prediction from structural constraints. For example, the proposed object can be a door lock if the object behind it is a door and the proposed object is smaller than the door [18].

DCNNs exploit contextual details even without explicit modeling, since the CNN architecture enforces hierarchical feature representation. Nevertheless, dedicated research has explicitly modeled local and global context; representative frameworks include CoupleNet [65], ORN [29], DeepIDNet [30] and ION [5]. In addition to the CNN's hierarchical feature representation, both detector families perform implicit context modeling: the single-stage detector looks at the entire image for detection, thereby modeling global context, while in the two-stage detector the regressor subnetwork refines the object boundary and thus exploits local context.
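As an illustration of how explicit global-context fusion might look (a generic sketch, not the architecture of CoupleNet, ORN, DeepIDNet or ION), the snippet below pools a whole-image feature vector and concatenates it with each region's pooled features before classification; all layer sizes and names are assumptions.

```python
# Illustrative sketch: fusing a global-context vector with per-region
# features before classification. Dimensions and class count are assumptions.
import torch
import torch.nn as nn

class ContextFusionHead(nn.Module):
    def __init__(self, roi_dim=256, ctx_dim=256, num_classes=21):
        super().__init__()
        self.global_pool = nn.AdaptiveAvgPool2d(1)          # whole-scene context
        self.classifier = nn.Linear(roi_dim + ctx_dim, num_classes)

    def forward(self, feature_map, roi_features):
        # feature_map: (1, ctx_dim, H, W) backbone output for one image
        # roi_features: (N, roi_dim) pooled features for N region proposals
        ctx = self.global_pool(feature_map).flatten(1)      # (1, ctx_dim)
        ctx = ctx.expand(roi_features.size(0), -1)          # broadcast to each ROI
        fused = torch.cat([roi_features, ctx], dim=1)       # append global context
        return self.classifier(fused)

head = ContextFusionHead()
scores = head(torch.randn(1, 256, 32, 32), torch.randn(10, 256))
print(scores.shape)  # torch.Size([10, 21]): class scores per proposal
```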

Object Proposal

Object proposal is a preprocessing step before the actual detection. In the absence of object proposals, the detector must scan many scales and aspect ratios [12, 15, 59, 66], which adds computational load and makes the whole process very slow. Object proposal eases the detection framework by selectively providing a few proposals [58] derived from objectness cues (edge, texture, color, gradient) [35]. After the growth of DCNNs, selective search became the computational bottleneck of the entire detection framework. It has been shown that DCNNs have excellent proficiency in locating objects from their convolutional layers [46]; this observation led to generating proposals within the detection framework itself. A DCNN-based proposal has a computational advantage over external proposal methods (selective search, MCG and EdgeBoxes [30]) and provides a unified framework combining proposal, classification and bounding box regression. The first such DCNN-based proposal method was the region proposal network (RPN) [46]; combining the RPN with RCNN is a milestone in object detection (Faster RCNN) [46]. Consequently, many DCNN-based proposal methods have appeared; representative methods include DeepProposal [19], ZIP [32] and DeNet [56], which further improved the performance of object proposal. A two-stage detector with an RPN is the key to many detection challenges, including Pascal VOC and COCO. Nevertheless, DeepProposal [19], ZIP [32], DeNet [56], etc., show a performance gain over the RPN at a slight computational cost.
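For intuition, the sketch below shows RPN-style anchor generation: a fixed set of scales and aspect ratios is tiled over every cell of the backbone feature map, and a separate objectness head (not shown) would then score and refine these boxes. The stride, scale and ratio values are illustrative, not taken from [46].

```python
# Minimal sketch of RPN-style anchor generation: at every feature-map cell,
# anchors of several scales and aspect ratios are laid out in image coordinates.
# Stride, scales and ratios are illustrative assumptions.
import itertools
import numpy as np

def generate_anchors(feat_h, feat_w, stride=16,
                     scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    anchors = []
    for y, x in itertools.product(range(feat_h), range(feat_w)):
        cx, cy = (x + 0.5) * stride, (y + 0.5) * stride    # cell centre in image coords
        for s, r in itertools.product(scales, ratios):
            w, h = s * np.sqrt(r), s / np.sqrt(r)          # area ~ s^2, width/height ratio = r
            anchors.append([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2])
    return np.array(anchors)                               # (H*W*9, 4) candidate boxes

print(generate_anchors(2, 2).shape)  # (36, 4): 2*2 cells x 9 anchors each
```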

Data Augmentation

Data augmentation refers to artificially subjecting the training data to various transformations, such as scaling, cropping, rotating, flipping, distorting and adding noise, while leaving the underlying category unchanged. Augmentation produces more training samples, helps generalization and avoids overfitting [41, 61]. Researchers [14, 24] proposed enlarging datasets by pasting segmented objects into realistic images. Further, Dvornik et al. [13] showed that correctly modeling an object's local context is key to placing it in the right surroundings [34].
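A minimal sketch of label-preserving augmentation with torchvision is shown below; the transform set and parameters are assumptions for illustration rather than the pipelines used in the cited works, and for detection the geometric transforms must also be applied to the bounding boxes.

```python
# Sketch of label-preserving augmentations with torchvision;
# transform choices and parameters are illustrative assumptions.
import torchvision.transforms as T

train_transform = T.Compose([
    T.RandomResizedCrop(300, scale=(0.5, 1.0)),   # random scale + crop
    T.RandomHorizontalFlip(p=0.5),                # mirror the scene
    T.ColorJitter(brightness=0.2, contrast=0.2),  # photometric distortion
    T.ToTensor(),
])
# Note: for detection, geometric transforms must also be applied to the
# bounding boxes (e.g., via torchvision.transforms.v2 or albumentations).
```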

Localization Error

IoU is an evaluation metric for localization whose value can eventually affect the detection framework. Intersection over union compares the predicted bounding box with the ground truth and is ordinarily expected to be greater than or equal to 0.5. The bounding box regressor optimizes the bounded area, aiming to increase IoU, in parallel with classification. Bounding boxes are a coarse estimation, so background pixels are included in the bounding box, which affects localization performance. Usually, a post-processing step such as non-maximum suppression [8, 28, 34] is applied to remove inappropriate bounding boxes, but an excellent localization can be suppressed due to wrong alignment. A few approaches have been developed to minimize localization error; representative methods include MRCNN [20], CRAFT [62] and cascade RCNN [9]. In MRCNN, RCNN is applied several times to adjust the bounding box iteratively. CRAFT [62] and AttractioNet [21] adopt multistage detection to obtain the best proposals, which are handed over to Fast RCNN. Cai and Vasconcelos proposed cascade RCNN, an extension of multistage RCNN, in which a cascade of RCNNs is trained sequentially with each stage using an increasing IoU threshold.
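The sketch below shows how IoU is computed for axis-aligned boxes in (x1, y1, x2, y2) form and how greedy non-maximum suppression uses it to discard overlapping detections; the 0.5 threshold is the conventional value mentioned above.

```python
# Minimal sketch: IoU between two axis-aligned boxes (x1, y1, x2, y2) and a
# greedy non-maximum suppression pass built on it.
def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def nms(boxes, scores, thresh=0.5):
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # drop boxes overlapping the kept box above the IoU threshold
        order = [i for i in order if iou(boxes[best], boxes[i]) < thresh]
    return keep

boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]]
print(nms(boxes, [0.9, 0.8, 0.7]))  # -> [0, 2]: the second box is suppressed
```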

Training Strategy

A deep learning detection framework requires massive data to perform well, and data augmentation is commonly applied during training to alleviate scale variation problems. Training with massive data tends to complicate and overload the process, so effective training and fast convergence are of utmost concern. A few training strategies have proved effective in the literature. Singh and Davis proposed SNIP [8, 50,51,52], an innovative training technique that reduces scale variation without shrinking the training data. Singh et al. proposed SNIPER, which efficiently processes only the context region around the ground truth at the relevant scale instead of dealing with the entire image pyramid. Minibatch size plays a key role in fast convergence. Peng et al. proposed MegDet, which enables a large minibatch size and is effective for faster training and rapid convergence. Further, Peng et al. introduced concurrent GPU training that eases COCO training, finishing in four hours by processing concurrently on 128 GPUs with the help of GPU batch normalization and a novel learning rate policy; this approach won the COCO 2017 detection challenge [40].
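As a rough illustration of the large-minibatch recipe, the helper below scales a base learning rate linearly with the batch size and warms it up over the first iterations; this is a sketch assuming the linear scaling rule with gradual warmup commonly associated with MegDet-style training, and all numeric values are assumptions.

```python
# Sketch of a linear-scaling + warmup learning rate policy for large
# minibatch training. base_lr, base_batch, batch_size and warmup_steps
# are illustrative assumptions, not values from the cited work.
def warmup_linear_scaling_lr(step, base_lr=0.02, base_batch=16,
                             batch_size=256, warmup_steps=500):
    scaled_lr = base_lr * batch_size / base_batch  # linear scaling rule
    if step < warmup_steps:                        # gradual warmup phase
        return scaled_lr * (step + 1) / warmup_steps
    return scaled_lr

for s in (0, 250, 499, 1000):
    print(s, round(warmup_linear_scaling_lr(s), 4))
```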

Comparative Analysis of Indirect-Performance

In sections "Context", "Object proposal", "Data augmentation", "Localization error" and "Training strategy", we discussed the indirect performance parameters and their representative methods. To highlight the effectiveness of these parameters and the settings in which they yield more performance than general detectors, recent works are analyzed as shown in Table 2. The comparison follows the standard evaluation sequence depicted in Fig. 1.

Table 2 Comparative analysis of indirect performance parameters of object detection
Fig. 1 Comparative analysis of indirect performance parameters

The comparative analysis considers different problem scenarios, such as insufficient datasets, insufficient feature extraction, detecting small objects, localization error and training massive deep learning models. The results in Table 2 highlight that indirect parameters can contribute significantly to performance when a generic detector degrades due to lack of data samples, weak feature extraction, class imbalance, etc. Moreover, from the comparative analysis we suggest a specific parameter for each problem scenario.

  • Data augmentation: when training data are scarce.

  • Context modeling: when feature extraction is insufficient or image quality is poor.

  • Object proposal: when detecting very small or tiny objects.

  • Effective localization methods: when there is a huge class imbalance among the classes in the dataset.

  • Training strategy: when training on a huge volume of data.

Applications of Object Detection

Object detection has been widely used in numerous applications, especially in the medical, military, security, anomaly detection, and science and engineering fields, as shown in Fig. 2.

Fig. 2 Applications of an object detector

Medical Field

Brain Tumor

Manual segmentation of brain tumors is a laborious and time-consuming task for radiologists. Deep learning has been developed as a feasible option for medical imaging applications: a neural network can grasp discriminative features automatically and learn the brain's essential features in order to classify and segment tumors. This approach outperforms manual segmentation and classical machine learning approaches in terms of false-positive reduction. Among deep learning-based methods, CNNs have provided the best performance for brain tumor segmentation [44].

Radiolucent Lesions

Identification and segmentation of mandibular radiolucent lesions on panoramic radiographs targets five radiolucent lesions (radicular cysts, dentigerous cysts, ameloblastomas, odontogenic keratocysts and simple bone cysts) that occur regularly in the mandible. A deep learning approach has demonstrated a high standard of detection and classification performance for radiolucent lesions of the mandible [3].

Cell Biology

Segmenting a cell from blood or other tissue poses significant challenges due to variability in morphology, color intensity and cell size, but deep vision handles them with precise accuracy. It has displayed outstanding accuracy in classifying and detecting B cells and T cells separated by a microfluidic chip [55].

Security Field

Luggage Scanner

In international travel, increased traveller throughput and expanded border security screening (e.g., postal, maritime and freight) demand timely computerized image identification. Convolutional neural networks (CNNs), a leading approach to modern object detection problems, are also applied to X-ray baggage images to identify potential threat objects (gun, shuriken, razor blade and knife). The research results highlight that CNNs achieve exceptional accuracy in detecting threat objects [2].

Anomaly Detection

Identification of Defects in Tiny Particles

Tiny tools and parts (less than 1 cm) can develop flaws due to working conditions and poor design; mass-produced products are prone to faults such as holes, sags and abrasions, and decay and fatigue damage arise in day-to-day operation. Deep learning's effective feature extraction capability is utilized to identify defects in tiny parts: an SSD object detector proved able to detect flaws in 0.8-cm darning needles accurately [63].

Anomaly in Steel Structure

Bolt loosening affects the safety of steel structures and may lead to severe accidents. Due to the complicated vibration properties of bolted joints, it is hard to recognize bolt loosening in steel structures from a conventional dynamics perspective. However, deep learning's strong feature extraction capability can be used to detect the screws and the screw number, and from the detected set, bolt loosening is effectively identified using trigonometric relationships [64].

Anomaly in Food Particles

The identification of foreign objects plays an essential part in agricultural commodities. Foreign particles are introduced into food products in various ways, and foreign objects in food are a major source of customer complaints. Their identification is highly significant for quality and health and is a major concern for food safety conventions. Manual extraction of foreign objects from food material is a time-consuming and laborious task. A DCNN was applied to segregate foreign objects from walnuts; it overcomes the clustering of walnuts with foreign particles, which was a challenging task for manual feature engineering. The DCNN performed with above 99% accuracy on more than 100 test images [47].

Science and Engineering

Nanoparticles Segmentation

Microscopes produce images that are large in both number and resolution. The shapes, size distributions and properties of nanoparticles play an essential part in interpreting a material, so the nanoparticles in each image should be identified and detected to give a measurable guide. Segmentation of these particles is challenging because of overlapping instances and variable particle sizes and shapes; moreover, manual detection and segmentation of nanoparticles are laborious and time-consuming. A deep learning paradigm is used for detecting and segmenting nanoparticles from transmission electron microscopy (TEM) images. A multiple-output convolutional neural network (MO-CNN) is used for concurrent recognition and segmentation of nanoparticles. The proposed deep learning approach is powerful and efficient, with high precision, and is capable of analyzing nanoparticles even with overlapping particles and complex backgrounds [38].

Military

Surveillance

Surveillance plays a crucial role in the continuous analysis of massive amounts of critical visual information. For human operators, detecting targets, monitoring security-sensitive areas and spotting possible suspicious activities increase cognitive load, exhaust energy and remain prone to error. The best alternative is to delegate these tasks to computer vision, which can detect suspects and monitor scenes with ease and with a lower probability of error [4].

Conclusions

This survey briefly reviewed the predominant methods of object detection, starting from pre-deep learning methods. Most importantly, we have given preference to the indirect parameters. From the review and comparative analysis, we conclude that indirect performance parameters play a crucial role in various problems across different areas by boosting performance in comparison with a generic detector. Furthermore, we have highlighted which parameter can be the appropriate choice for each problem condition.

Moreover, we have shown the best applications of object detection in various domains. Results from these applications show that deep learning methods outperform conventional methods and approach human visual perception. The transition of technological intervention from assisting to being fully depended upon therefore raises serious anxiety about the future role of human intervention in various tasks. Nevertheless, it is essential to engage with the evolving technology in order to remain relevant in a fast-evolving, machine-centric period.