Lightweight convolutional neural network-based pedestrian detection and re-identification in multiple scenarios

Ke, Xiao; Lin, Xinru; Qin, Liyun

doi:10.1007/s00138-021-01169-7

Original Paper
Published: 24 February 2021

Volume 32, article number 46, (2021)

Machine Vision and Applications

Abstract

Pedestrian detection and re-identification is a research hotspot in computer vision. Current methods suffer from insufficient pedestrian feature representation, occlusion, diverse pedestrian poses, and difficulty in detecting small-scale pedestrians. In this paper, we propose an end-to-end pedestrian detection and re-identification model for real scenes, which can effectively address these problems. In our model, the original images are processed with a non-overlapping image blocking data augmentation method and then fed into the YOLOv3 detector to obtain the object position information. An LCNN-based pedestrian re-identification model is used to extract the features of the object. The feature vectors of the object and of the detected pedestrians are then computed, and the similarity between them is used to determine whether the detected pedestrians can be marked as target pedestrians. Our method is lightweight and end-to-end, and it can be applied to real scenes.

1 Introduction

With the rapid improvement of computer technology and living standards, artificial intelligence has become increasingly integrated into our lives, in the form of technologies such as driverless vehicles, intelligent assisted driving, pedestrian analysis, intelligent robots, intelligent video surveillance, intelligent transportation, security systems, and glasses specially designed for blind people. Pedestrian detection and re-identification is a core technology that provides essential technical support for these applications, and its wide application makes it an important part of computer vision research. Although many research institutes and universities are working in this field, many issues in pedestrian detection and re-identification, such as low illumination, low contrast, occlusion, and insufficient feature representation, have yet to be overcome. Therefore, further research on pedestrian detection and re-identification techniques is needed.

Several methods have been proposed for pedestrian detection. In [1], image features were extracted using the wavelet method, which was the first application of a machine learning algorithm to pedestrian detection. In [2, 3], the authors proposed the HOG + SVM algorithm; in [4], the ACF algorithm was proposed; and Nam et al. proposed the Locally Decorrelated Channel Features (LDCF) method, which improves on the performance of ACF and achieves the best results among shallow models. Deep learning techniques have also been introduced into the field of pedestrian detection. Krizhevsky et al. [5] used a convolutional neural network (CNN) to win the ImageNet competition easily, which attracted wide attention. In [6], the authors proposed the Region-based CNN (R-CNN) method, and the authors of [7, 8] designed and improved the Faster R-CNN method based on R-CNN, greatly improving performance and speed. In [9], the single shot multibox detector (SSD) detection algorithm was proposed. The authors of [10] proposed the YOLO algorithm and then designed the improved YOLOv2 in [11], whose accuracy on the visual object classes (VOC) data set reached 78.6% at 40 FPS. In 2018, YOLOv3 [12] was proposed; its accuracy on the VOC data set is nearly 4% higher than that of YOLOv2 and, unlike YOLO and YOLOv2, it is sensitive to small objects.

The pedestrian re-identification process is divided into two main parts. First, pedestrian features with good expressiveness and strong robustness are extracted from the picture, and second, the model with the highest identification and matching ability is used to match the images. In [38], the proposed algorithm is used to detect the face, and facial distance measurements are then obtained by a geometric-based facial distance measurement technique. Finally, the measured facial distances are evaluated with physiognomy to reveal the characteristic properties of the person. In [39], the boundary of the face is determined by the feature surface method, and the corresponding face distance measurement value is then determined by the Euclidean distance measurement technique. References [38, 39] both extract facial features and then evaluate them by facial feature distance. Unlike face recognition, we extract features of the body. For example, in [13], a triangle model is constructed to represent the shape features of the human body, and the color features of each part are extracted for pedestrian recognition. In [14], the authors proposed the local maximal occurrence (LOMO) method, which uses maximum pooling to process the blocks in the same horizontal strip and uses homomorphic filtering to minimize the influence of illumination changes on the image. The first deep learning algorithms for the pedestrian recognition problem were reported in [15, 16]. In [17], multiple large data sets were used for training, and Softmax was used as the classification function in the classification network. The classification model performs well on large data sets such as PRW [18] and MARS [19], and does not need to pick sample pairs. Tong Xiao et al. also used the framework of the Faster R-CNN algorithm for pedestrian detection and pedestrian re-identification [20, 21]. Liang Zheng et al. studied the combination of different conventional and deep learning pedestrian detection and pedestrian recognition methods, and proposed that a good detection frame directly helps identification [18]. The authors in [22] proposed the large margin nearest neighbor (LMNN) distance metric learning algorithm, which improved the measurement effect. In [23], the keep it simple and straightforward (KISS) distance metric algorithm was proposed, which does not require iterative learning and is therefore suitable for large-scale data. In [24], the authors developed local Fisher discriminant analysis (LFDA) and applied it to pedestrian recognition. In [40], more attention is paid to semantic information: a data-driven pedestrian re-identification model based on hierarchical semantic representation is proposed, which enhances the semantic representation of features with hierarchical mid-level 'attributes'. In [41], a kernel-PCA-based spatial descriptor is generated, and the descriptor is evaluated using known distance metric learning methods.
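The matching step of re-identification, in which the feature vector of a query pedestrian is compared with the feature vectors of detected pedestrians, can be illustrated with a minimal sketch. This is not the implementation used in this paper: the choice of cosine similarity, the function names `cosine_similarity` and `match_target`, and the threshold value 0.7 are all assumptions made for illustration, and the feature vectors stand in for the output of any re-identification network.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here: a zero vector matches nothing
    return dot / (norm_a * norm_b)


def match_target(
    query: Sequence[float],
    gallery: Sequence[Sequence[float]],
    threshold: float = 0.7,  # hypothetical value, not taken from the paper
) -> List[int]:
    """Return indices of gallery vectors whose similarity to the query
    reaches the threshold, ordered from most to least similar."""
    scored = [(cosine_similarity(query, g), i) for i, g in enumerate(gallery)]
    hits = [(s, i) for s, i in scored if s >= threshold]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in hits]


if __name__ == "__main__":
    query_feature = [1.0, 0.0]
    detected_features = [[2.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(match_target(query_feature, detected_features))
```

In practice the threshold would be tuned on validation data, and a learned metric such as those cited above could replace the cosine similarity.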

The main contributions of this paper can be summarized as follows:

1. To solve the problems of current pedestrian detection research, such as complex or similar background and foreground, diverse perspectives, diverse pedestrian poses, and difficulty in detecting small-scale pedestrians, a data augmentation-based YOLOv3 pedestrian detection method is proposed, which involves the following steps: first, a white edge is added to the original image to form a square, which is divided into 9 blocks to obtain 9 sub-pictures; the original picture is then fed into the YOLOv3 detector, and the detected image is adopted. The multi-scale fusion method is used for detection, which significantly improves the detection of small objects. An optimal pedestrian detection model is constructed from the proposed data augmentation-based YOLOv3 detection method, which addresses the problems of diverse perspectives, diverse pedestrian poses, and difficulty in detecting small-scale pedestrians.
2. In pedestrian re-identification research, the appearance of the same pedestrian varies across multiple cameras, and image resolution differs between cameras. Existing pedestrian recognition models cannot extract strongly expressive pedestrian features and cannot distinguish between target and non-target pedestrians. We propose a pedestrian recognition model based on the lightweight CNN (LCNN), which has been successfully applied to face recognition. The model adopts the network-in-network (NIN) structure, which reduces the number of parameters while preserving the recognition ability of the model. The LCNN-based model proposed in this paper improves the feature representation of pedestrians and enhances pedestrian recognition.
3. Pedestrian re-identification is still rarely applied in real scenes, where the data are complex and varied and many small-scale pedestrians hinder effective pedestrian feature extraction. The existing pedestrian re-identification models used in real scenes need to be improved. In the proposed method, the data augmentation-based YOLOv3 detector is combined with the LCNN-based pedestrian re-identification model, and the image is scaled up for pedestrians whose size is less than a specified value. The proposed end-to-end framework has improved robustness and accuracy.
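Two of the image operations named in the contributions above, padding the image with a white edge to form a square and splitting it into 9 non-overlapping blocks, and scaling up pedestrian crops below a specified size, can be sketched as follows. This is a simplified illustration under stated assumptions rather than the implementation used in this paper: the image is modeled as a grayscale list of rows, the white edge is split evenly on both sides, size is measured by crop height, nearest-neighbor replication is used for scaling, and the names `pad_to_square`, `split_into_blocks`, `upscale_if_small`, and the default `min_height` of 64 are hypothetical.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of pixel values (simplification)
WHITE = 255


def pad_to_square(img: Image) -> Image:
    """Add a white edge so the image becomes square.
    Centering the original image is an assumption of this sketch."""
    h, w = len(img), len(img[0])
    side = max(h, w)
    top, left = (side - h) // 2, (side - w) // 2
    out = [[WHITE] * side for _ in range(side)]
    for r in range(h):
        out[top + r][left:left + w] = img[r]
    return out


def split_into_blocks(square: Image, grid: int = 3) -> List[Image]:
    """Split a square image into grid x grid non-overlapping blocks
    (9 sub-pictures for grid=3), in row-major order."""
    side = len(square)
    bounds = [side * k // grid for k in range(grid + 1)]
    blocks = []
    for i in range(grid):
        rows = square[bounds[i]:bounds[i + 1]]
        for j in range(grid):
            blocks.append([row[bounds[j]:bounds[j + 1]] for row in rows])
    return blocks


def upscale_if_small(crop: Image, min_height: int = 64) -> Image:
    """Enlarge a pedestrian crop by an integer factor (nearest neighbor)
    when its height is below min_height; otherwise return it unchanged."""
    h = len(crop)
    if h >= min_height:
        return crop
    factor = -(-min_height // h)  # ceiling division
    out = []
    for row in crop:
        wide = [p for p in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


if __name__ == "__main__":
    image = [[0] * 6 for _ in range(3)]  # 3 rows x 6 columns
    sub_pictures = split_into_blocks(pad_to_square(image))
    print(len(sub_pictures), "blocks of size",
          len(sub_pictures[0]), "x", len(sub_pictures[0][0]))
```

Detections made on sub-pictures would then need to be mapped back to the coordinates of the original image by adding each block's offset and removing the padding, a step this sketch omits.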

The rest of this paper is arranged as follows: Sect. 2 introduces the related work and motivation for the proposed method; Sect. 3 introduces the end-to-end pedestrian detection and re-identification applied to the real scene; Sect. 4 is the experimental section; and in Sect. 5 we summarize the article.

2 Related work

Pedestrian detection is applied extremely widely and is one of the most important research topics in object detection. Object detection algorithms can be divided into those based on traditional methods and those based on deep learning. Felzenszwalb et al. [42] proposed the DPM algorithm in 2008, which is a component-based detection method. The DPM algorithm is generally better than other traditional algorithms, but deep learning-based object detection algorithms are superior in speed and accuracy. At present, there are two main types of deep learning-based object detection algorithms. The first type, two-stage methods such as Faster R-CNN, divides the object detection task into two stages; the second type, such as the YOLO series and SSD, treats object detection as an end-to-end task. The performance of these object detection algorithms on the VOC2007 and VOC2012 data sets is shown in Table 1.

Table 1 Comparison of object detection algorithm performance on VOC2007 and VOC2012 datasets

