Triplet Ratio Loss for Robust Person Re-identification

Hu, Shuping; Wang, Kan; Cheng, Jun; Tan, Huan; Pang, Jianxin

doi:10.1007/978-3-031-18907-4_4

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13534)

Included in the following conference series:

Chinese Conference on Pattern Recognition and Computer Vision (PRCV)

Abstract

Triplet loss has been proven to be useful in the task of person re-identification (ReID). However, it has limitations due to the influence of large intra-pair variations and unreasonable gradients. In this paper, we propose a novel loss that reduces the influence of large intra-pair variations and improves the optimization gradients by optimizing the ratio of the intra-identity distance to the inter-identity distance. As it also requires a triplet of pedestrian images, we call this new loss the triplet ratio loss. Experimental results on four widely used ReID benchmarks, i.e., Market-1501, DukeMTMC-ReID, CUHK03, and MSMT17, demonstrate that the triplet ratio loss outperforms the previous triplet loss.

1 Introduction

The goal of person re-identification (ReID) is to identify a person of interest using pedestrian images captured across disjoint camera views. Due to its wide deployment in real-world applications such as intelligent surveillance, ReID has become an important topic [6, 12, 14, 32, 33, 37, 38, 39, 42, 48, 52].

The key to robust ReID lies in high-quality pedestrian representations. However, due to detection errors, background occlusions and pose variations, extracting discriminative pedestrian representations in ReID is still challenging [18, 25, 27, 31, 34]. Previous methods typically adopt two different strategies to improve the quality of pedestrian representations for ReID. The first strategy equips the deep network with various modules to enhance the discriminative ability of the extracted features [2, 5, 27, 30, 32]. In comparison, the second strategy leaves the network unchanged and designs loss functions that directly optimize the extracted features. Existing loss functions for ReID can be categorized into two groups: classification-based and distance-based loss functions [4, 24, 26, 28, 29], of which the triplet loss [24] is the most popular one. In brief, the triplet loss optimizes pedestrian features by maximizing the inter-identity distances and minimizing the intra-identity distances.

However, the triplet loss has two inherent problems. First, the quality of the pedestrian features optimized with the triplet loss is heavily affected by intra-pair variations because the margin is fixed. More specifically, for the three triplets presented in Fig. 1, the distance of the negative pair is moderate, small and large, respectively. A margin that is reasonable for the triplet in Fig. 1(a) is therefore inappropriate for the triplets in Fig. 1(b) and Fig. 1(c), because the triplet constraint becomes too tight or too loose for these two triplets, respectively. Second, the distance of the negative pair is small for the triplet in Fig. 1(b); the triplet loss might therefore result in a collapsed ReID model when an improper triplet sampling strategy is used. From the mathematical point of view [36], this is because the triplet loss gives only a slight repelling gradient (see Note 1) to a hard negative image while giving a large attracting gradient (see Note 1) to a hard positive image. As a result, the embeddings of all pedestrian images will shrink to the same point.

Given the above, an effective loss function should adjust the margin according to the respective triplet and provide more reasonable gradients during the training stage. Accordingly, in this paper, we propose a novel loss called the triplet ratio loss.

First, different from the triplet loss, which optimizes the “difference” between the intra-identity distance and the inter-identity distance, the proposed triplet ratio loss directly optimizes the “ratio” of the intra-identity distance to the inter-identity distance. More specifically, for a triplet, the triplet ratio loss requires the “ratio” to be smaller than a pre-defined hyper-parameter. Based on this goal and approach, we name it the “triplet ratio loss”. Intuitively, as shown in Fig. 2(a), the triplet $\{A, P_0, N_0\}$ (where $A$, $P_0$ and $N_0$ denote the anchor, positive and negative images, respectively) is active but hard to optimize for the triplet loss, since the intra-identity distance is larger than the decision boundary (the left boundary of the red rectangle). Besides, $\{A, P_2, N_2\}$ makes no contribution to the triplet loss, since the intra-identity distance is already smaller than the respective decision boundary. In comparison, as illustrated in Fig. 2(b), the proposed triplet ratio loss is able to relax the tight constraint for triplet $\{A, P_0, N_0\}$ and tighten the loose constraint for triplet $\{A, P_2, N_2\}$.

Second, the proposed triplet ratio loss improves the convergence of the ReID model. Compared with the triplet loss, the triplet ratio loss adaptively adjusts the gradients of the features in the same triplet in a more reasonable manner. More specifically, the triplet ratio loss adjusts the gradients for the anchor image, positive image and negative image by considering both the inter-identity distance and the intra-identity distance. Therefore, the triplet ratio loss gives a larger repelling gradient to a hard negative pair. Consequently, the proposed triplet ratio loss makes the ReID model easier to converge and is able to prevent ReID models from shrinking to the same point [36].

The contributions of this work can be summarized as follows:

We study two problems related to the triplet loss for robust ReID: 1) intra-pair variations and 2) unreasonable gradients.
We propose the triplet ratio loss to address the aforementioned problems by optimizing the ratio of the intra-identity distance to the inter-identity distance.
Extensive experimental results on four widely used ReID benchmarks demonstrate the effectiveness of the proposed triplet ratio loss.

2 Related Work

We divide existing deep learning-based ReID methods into two categories, i.e., deep feature learning-based and deep metric learning-based methods, according to the way they improve pedestrian representations.

Deep Feature Learning-Based Methods. Methods within this category aim to learn discriminative representations by designing powerful backbones. In particular, recent methods typically insert attention modules into the backbone model to enhance its representation power [2, 5]. Besides, part-level representations, obtained by sequentially performing body part detection and part-level feature extraction, have been proven to be effective for ReID, as the part features contain fine-grained information [27, 30].

Deep Metric Learning-Based Methods. Methods within this category can be further categorized into two elemental learning paradigms, i.e., optimizing via class-level labels or via pair-wise labels. Methods using class-level labels, e.g., the softmax loss [17, 29, 40], typically learn identity-relevant proxies to represent different identities of samples. In comparison, methods based on pair-wise labels usually enhance the quality of pedestrian representations by explicitly optimizing the intra-identity and inter-identity distances [4, 24, 26, 29]. As one of the most popular methods that use pair-wise labels, the triplet loss [24] has been proven to be effective in enlarging the inter-identity distances and improving the intra-identity compactness.

However, the triplet loss has three inherent limitations [10, 24, 36]. First, the performance of the triplet loss is influenced by the triplet sampling strategy. Therefore, many attempts have been made to improve the triplet selection scheme [36, 41], e.g., distance-weighted sampling and correction of the selection bias. Second, the intra-identity features extracted by models are not sufficiently compact, which leads to intra-pair variations. Accordingly, some methods [3, 11] have been proposed to constrain the relative difference between the intra-identity distance and the inter-identity distance, and to push the negative pair away by a minimum distance. Third, the choice of the constant margin is also vital for learning efficiency. More specifically, a margin that is too large or too small may lead to worse performance. To this end, subsequent methods were proposed to adjust the margin based on the properties of each triplet [8, 44]. However, these variants of the triplet loss still adopt the margin-based formulation [3, 8, 11, 44].

In comparison, the proposed triplet ratio loss introduces a novel ratio-based mechanism for optimization: optimizing the ratio of the intra-identity distance to the inter-identity distance. Therefore, the triplet ratio loss can adaptively adjust the margin according to the respective triplet so as to weaken the influence of intra-pair variations. Besides, the triplet ratio loss also makes the optimization gradients more reasonable; the ReID model therefore enjoys faster convergence and a more compact converged state.

3 Method

3.1 Triplet Loss

As one of the most popular loss functions for ReID, the triplet loss [24] aims at improving the quality of pedestrian representations by maximizing the inter-identity distances and minimizing the intra-identity distances. More specifically, the triplet loss is formulated as follows:

$$\begin{aligned} \mathcal {L}_\mathrm{{triplet}} = \sum _{a,p,n \in \mathcal {N}}{\left[ D\left( \textbf{f}_{i}^{a}, \textbf{f}_{i}^{p} \right) - D\left( \textbf{f}_{i}^{a}, \textbf{f}_{i}^{n} \right) + \alpha \right] _{+}}. \end{aligned}$$

(1)

Here $\alpha $ is the margin of the triplet constraint, and $\mathcal {N}$ indicates the set of sampled triplets. $\textbf{f}_{i}^{a}$, $\textbf{f}_{i}^{p}$, $\textbf{f}_{i}^{n}$ represent the feature representations of the anchor image, positive image and negative image within a triplet, respectively. $D(\textbf{x}, \textbf{y}) = {\parallel \textbf{x} - \textbf{y}\parallel }_{2}^{2}$ represents the distance between embedding $\textbf{x}$ and embedding $\textbf{y}$. $\left[ \cdot \right] _{+} = \max (0, \cdot )$ denotes the hinge function. During optimization, the derivatives with respect to each feature representation of an active triplet (one whose hinge term is positive) are computed as follows:

$$\begin{aligned} \begin{aligned} \frac{\partial {\mathcal {L}_\mathrm{{triplet}}}}{\partial {\textbf{f}_{i}^{a}}} = 2(\textbf{f}_{i}^{n} - \textbf{f}_{i}^{p}),~~~ \frac{\partial {\mathcal {L}_\mathrm{{triplet}}}}{\partial {\textbf{f}_{i}^{p}}} = 2(\textbf{f}_{i}^{p} - \textbf{f}_{i}^{a}),~~~ \frac{\partial {\mathcal {L}_\mathrm{{triplet}}}}{\partial {\textbf{f}_{i}^{n}}} = 2(\textbf{f}_{i}^{a} - \textbf{f}_{i}^{n}). \end{aligned} \end{aligned}$$

(2)

However, the triplet loss simply focuses on obtaining the correct order for each sampled triplet; it therefore suffers from large intra-pair variations [26] and an unreasonable repelling gradient for $\textbf{f}_{i}^{n}$ [36].
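The behaviour described by Eq. (1) and Eq. (2) can be checked on a toy triplet. The following is a minimal sketch in plain Python, not the authors' implementation; the helper names (`sq_dist`, `triplet_loss`, `triplet_grads`, `l2_norm`) and the 2-D example points are assumptions made purely for illustration.

```python
def sq_dist(x, y):
    """Squared Euclidean distance D(x, y) = ||x - y||_2^2."""
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))


def triplet_loss(a, p, n, alpha=0.4):
    """Eq. (1) for a single triplet (anchor a, positive p, negative n)."""
    return max(0.0, sq_dist(a, p) - sq_dist(a, n) + alpha)


def triplet_grads(a, p, n, alpha=0.4):
    """Eq. (2); the gradients are zero when the hinge is inactive."""
    if sq_dist(a, p) - sq_dist(a, n) + alpha <= 0.0:
        return [0.0] * len(a), [0.0] * len(a), [0.0] * len(a)
    grad_a = [2.0 * (ni - pi) for ni, pi in zip(n, p)]
    grad_p = [2.0 * (pi - ai) for pi, ai in zip(p, a)]
    grad_n = [2.0 * (ai - ni) for ai, ni in zip(a, n)]
    return grad_a, grad_p, grad_n


def l2_norm(v):
    """Euclidean norm of a vector."""
    return sum(vi * vi for vi in v) ** 0.5


# Illustrative triplet: a hard negative very close to the anchor
# and a hard positive far from it (values are made up).
anchor = [0.0, 0.0]
positive = [0.6, 0.0]
hard_negative = [0.0, 0.1]

loss = triplet_loss(anchor, positive, hard_negative)
grad_a, grad_p, grad_n = triplet_grads(anchor, positive, hard_negative)
print(f"loss = {loss:.2f}")
print(f"attracting |grad_p| = {l2_norm(grad_p):.2f}")
print(f"repelling  |grad_n| = {l2_norm(grad_n):.2f}")
```

In this toy case the attracting gradient on the positive is six times larger than the repelling gradient on the negative, which is the imbalance for hard negatives that the text attributes to the triplet loss.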

3.2 Triplet Ratio Loss

To address the aforementioned two drawbacks, we propose the triplet ratio loss which optimizes triplets from a novel perspective. In brief, the triplet ratio loss directly optimizes the “ratio” of the intra-identity distance to inter-identity distance. More specifically, the triplet ratio loss is formulated as:

$$\begin{aligned} \mathcal {L}_\mathrm{{tri\_ratio}} = \sum _{a,p,n \in \mathcal {N}}{\left[ \frac{D\left( \textbf{f}_{i}^{a}, \textbf{f}_{i}^{p} \right) }{D\left( \textbf{f}_{i}^{a}, \textbf{f}_{i}^{n} \right) } - \beta \right] _{+}}, \end{aligned}$$

(3)

where $\beta \in (0, 1)$ is the hyper-parameter of the triplet ratio constraint.

During the optimization, the derivatives of the triplet ratio loss with respect to $\textbf{f}_{i}^{a}$, $\textbf{f}_{i}^{p}$, $\textbf{f}_{i}^{n}$ are:

$$\begin{aligned} \begin{aligned}&\frac{\partial {\mathcal {L}_\mathrm{{tri\_ratio}}}}{\partial {\textbf{f}_{i}^{a}}} = \frac{-2D\left( \textbf{f}_{i}^{a}, \textbf{f}_{i}^{n} \right) \cdot (\textbf{f}_{i}^{p} - \textbf{f}_{i}^{a})-2D\left( \textbf{f}_{i}^{a}, \textbf{f}_{i}^{p} \right) \cdot (\textbf{f}_{i}^{a} - \textbf{f}_{i}^{n})}{{\left[ D\left( \textbf{f}_{i}^{a}, \textbf{f}_{i}^{n} \right) \right] ^{2}}}, \\&\frac{\partial {\mathcal {L}_\mathrm{{tri\_ratio}}}}{\partial {\textbf{f}_{i}^{p}}} = \frac{2(\textbf{f}_{i}^{p} - \textbf{f}_{i}^{a})}{D\left( \textbf{f}_{i}^{a}, \textbf{f}_{i}^{n} \right) }, \\&\frac{\partial {\mathcal {L}_\mathrm{{tri\_ratio}}}}{\partial {\textbf{f}_{i}^{n}}} = \frac{2D\left( \textbf{f}_{i}^{a}, \textbf{f}_{i}^{p} \right) \cdot (\textbf{f}_{i}^{a} - \textbf{f}_{i}^{n})}{{\left[ D\left( \textbf{f}_{i}^{a}, \textbf{f}_{i}^{n} \right) \right] ^{2}}}. \end{aligned} \end{aligned}$$

(4)
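Eq. (3) and Eq. (4) can be sanity-checked numerically. The sketch below, in plain Python, evaluates the triplet ratio loss for one triplet and compares the analytic gradients of Eq. (4) with central finite differences. The function names, the example vectors and the step size `eps` are illustrative assumptions, and the sketch assumes the anchor and negative do not coincide (otherwise the ratio is undefined).

```python
def sq_dist(x, y):
    """Squared Euclidean distance D(x, y) = ||x - y||_2^2."""
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))


def triplet_ratio_loss(a, p, n, beta=0.4):
    """Eq. (3) for a single triplet; assumes D(a, n) > 0."""
    return max(0.0, sq_dist(a, p) / sq_dist(a, n) - beta)


def triplet_ratio_grads(a, p, n, beta=0.4):
    """Analytic gradients of Eq. (4); zero when the hinge is inactive."""
    d_ap, d_an = sq_dist(a, p), sq_dist(a, n)
    if d_ap / d_an - beta <= 0.0:
        return [0.0] * len(a), [0.0] * len(a), [0.0] * len(a)
    grad_a = [(-2.0 * d_an * (pi - ai) - 2.0 * d_ap * (ai - ni)) / d_an ** 2
              for ai, pi, ni in zip(a, p, n)]
    grad_p = [2.0 * (pi - ai) / d_an for ai, pi in zip(a, p)]
    grad_n = [2.0 * d_ap * (ai - ni) / d_an ** 2 for ai, ni in zip(a, n)]
    return grad_a, grad_p, grad_n


def numeric_grad(fn, x, eps=1e-6):
    """Central finite-difference gradient of a scalar function fn at x."""
    grad = []
    for k in range(len(x)):
        hi, lo = list(x), list(x)
        hi[k] += eps
        lo[k] -= eps
        grad.append((fn(hi) - fn(lo)) / (2.0 * eps))
    return grad


# Made-up 2-D features for an active triplet (ratio > beta).
a, p, n = [0.1, -0.2], [0.7, 0.3], [-0.4, 0.5]

grad_a, grad_p, grad_n = triplet_ratio_grads(a, p, n)
num_a = numeric_grad(lambda v: triplet_ratio_loss(v, p, n), a)
num_p = numeric_grad(lambda v: triplet_ratio_loss(a, v, n), p)
num_n = numeric_grad(lambda v: triplet_ratio_loss(a, p, v), n)

max_err = max(abs(x - y) for x, y in
              zip(grad_a + grad_p + grad_n, num_a + num_p + num_n))
print(f"max |analytic - numeric| = {max_err:.2e}")
```

In an actual training pipeline these gradients would normally be produced by automatic differentiation; the explicit formulas are spelled out here only to mirror Eq. (4).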

Addressing the intra-pair variations. Compared with the triplet loss, the triplet ratio loss handles the intra-pair variations by adjusting the constraint for $\textbf{f}_{i}^{p}$ according to $D\left( \textbf{f}_{i}^{a}, \textbf{f}_{i}^{n} \right) $. More specifically, first, the triplet ratio loss relaxes the constraint on the intra-identity pair when $D\left( \textbf{f}_{i}^{a}, \textbf{f}_{i}^{n} \right) $ is small. Second, it tightens the constraint on the intra-identity pair when $D\left( \textbf{f}_{i}^{a}, \textbf{f}_{i}^{n} \right) $ is large. For example, when both $\alpha $ and $\beta $ are set to 0.4 (see Note 2), the constraint imposed on the intra-identity pair is adjustable for each triplet in the triplet ratio loss, but rigid in the triplet loss. More specifically, for the triplet in Fig. 1(b) where $D\left( \textbf{f}_{i}^{a}, \textbf{f}_{i}^{n} \right) = 0.5$, the triplet loss requires $D\left( \textbf{f}_{i}^{a}, \textbf{f}_{i}^{p} \right) \le 0.1$, while the triplet ratio loss only requires $D\left( \textbf{f}_{i}^{a}, \textbf{f}_{i}^{p} \right) \le 0.2$; it therefore relaxes the constraint on the intra-identity pair. Besides, for the triplet in Fig. 1(c) where $D\left( \textbf{f}_{i}^{a}, \textbf{f}_{i}^{n} \right) = 1$, the triplet ratio loss requires $D\left( \textbf{f}_{i}^{a}, \textbf{f}_{i}^{p} \right) \le 0.4$. This constraint is tighter than that of the triplet loss, which requires $D\left( \textbf{f}_{i}^{a}, \textbf{f}_{i}^{p} \right) \le 0.6$.
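The arithmetic in this example follows directly from setting the hinge terms of Eq. (1) and Eq. (3) to zero: the triplet loss is satisfied when the intra-identity distance is at most the inter-identity distance minus $\alpha $, and the triplet ratio loss when it is at most $\beta $ times the inter-identity distance. A small sketch (the function names are hypothetical) reproduces the four numbers:

```python
def max_intra_dist_triplet(d_an, alpha=0.4):
    """Largest D(a, p) with zero triplet loss: D(a, p) <= D(a, n) - alpha."""
    return d_an - alpha


def max_intra_dist_ratio(d_an, beta=0.4):
    """Largest D(a, p) with zero triplet ratio loss: D(a, p) <= beta * D(a, n)."""
    return beta * d_an


for d_an in (0.5, 1.0):
    t = max_intra_dist_triplet(d_an)
    r = max_intra_dist_ratio(d_an)
    print(f"D_an={d_an:.1f}: triplet D_ap <= {t:.1f}, ratio D_ap <= {r:.1f}")
# D_an=0.5: triplet D_ap <= 0.1, ratio D_ap <= 0.2
# D_an=1.0: triplet D_ap <= 0.6, ratio D_ap <= 0.4
```

Under these two bounds the constraints coincide at an inter-identity distance of $\alpha / (1 - \beta )$, roughly 0.67 for $\alpha = \beta = 0.4$; below that value the ratio constraint is the looser one, above it the tighter one.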
Addressing the gradients issue. During the training stage, the triplet ratio loss provides $\textbf{f}_{i}^{n}$ with a more reasonable repelling gradient. As illustrated in Eq. (4), the amplitude of the repelling gradient for $\textbf{f}_{i}^{n}$ is inversely related to $D\left( \textbf{f}_{i}^{a}, \textbf{f}_{i}^{n} \right) $. Therefore, as shown in Fig. 3, the amplitude of the repelling gradient for $\textbf{f}_{i}^{n}$ becomes reasonably large when $D\left( \textbf{f}_{i}^{a}, \textbf{f}_{i}^{n} \right) $ is small, which is consistent with intuition. In contrast, when $D\left( \textbf{f}_{i}^{a}, \textbf{f}_{i}^{n} \right) $ is large, so that the optimization goal is almost satisfied, the repelling gradient for $\textbf{f}_{i}^{n}$ becomes small and only weak optimization is applied. This is intuitive since $\textbf{f}_{i}^{n}$ does not need much optimization in this situation. The triplet loss, however, assigns a counter-intuitive repelling gradient to $\textbf{f}_{i}^{n}$, as presented in Fig. 3.

Besides, the gradients for both $\textbf{f}_{i}^{n}$ and $\textbf{f}_{i}^{p}$ also become more reasonable with the triplet ratio loss. As illustrated in Eq. (4), the gradient for $\textbf{f}_{i}^{n}$ depends on both $D\left( \textbf{f}_{i}^{a}, \textbf{f}_{i}^{n} \right) $ and $D\left( \textbf{f}_{i}^{a}, \textbf{f}_{i}^{p} \right) $, while the gradient for $\textbf{f}_{i}^{p}$ is scaled by $D\left( \textbf{f}_{i}^{a}, \textbf{f}_{i}^{n} \right) $. More specifically, first, the amplitude of the gradient for $\textbf{f}_{i}^{n}$ is proportional to $D\left( \textbf{f}_{i}^{a}, \textbf{f}_{i}^{p} \right) $. This means that little attention is paid to $\textbf{f}_{i}^{n}$ if the intra-identity pedestrian images are similar to each other. Second, the amplitude of the gradient for $\textbf{f}_{i}^{p}$ is inversely proportional to $D\left( \textbf{f}_{i}^{a}, \textbf{f}_{i}^{n} \right) $. Therefore, a triplet in which the pedestrian images of the inter-identity pair are obviously dissimilar does not put emphasis on optimizing $\textbf{f}_{i}^{p}$.
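As a worked step, the gradient amplitudes discussed in this subsection can be read off Eq. (2) and Eq. (4) by taking Euclidean norms and using ${\parallel \textbf{x} - \textbf{y}\parallel }_{2} = \sqrt{D(\textbf{x}, \textbf{y})}$. The following is a short derivation added for illustration; it holds for an active triplet and is not part of the original equations:

$$\begin{aligned} \left\| \frac{\partial {\mathcal {L}_\mathrm{{triplet}}}}{\partial {\textbf{f}_{i}^{n}}} \right\| _{2}&= 2\sqrt{D\left( \textbf{f}_{i}^{a}, \textbf{f}_{i}^{n} \right) }, \\ \left\| \frac{\partial {\mathcal {L}_\mathrm{{tri\_ratio}}}}{\partial {\textbf{f}_{i}^{n}}} \right\| _{2}&= \frac{2D\left( \textbf{f}_{i}^{a}, \textbf{f}_{i}^{p} \right) }{{\left[ D\left( \textbf{f}_{i}^{a}, \textbf{f}_{i}^{n} \right) \right] ^{3/2}}}, \\ \left\| \frac{\partial {\mathcal {L}_\mathrm{{tri\_ratio}}}}{\partial {\textbf{f}_{i}^{p}}} \right\| _{2}&= \frac{2\sqrt{D\left( \textbf{f}_{i}^{a}, \textbf{f}_{i}^{p} \right) }}{D\left( \textbf{f}_{i}^{a}, \textbf{f}_{i}^{n} \right) }. \end{aligned}$$

Read this way, the repelling gradient of the triplet loss shrinks as the negative approaches the anchor, whereas that of the triplet ratio loss grows, and both ratio-loss gradients decay once the negative is far away.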

During the training stage, the ReID model is optimized with both the cross-entropy loss and the triplet ratio loss; the overall objective function can therefore be written as follows:

$$\begin{aligned} \mathcal {L} = \mathcal {L}_\mathrm{{CE}} + \lambda \mathcal {L}_\mathrm{{tri\_ratio}}. \end{aligned}$$

(5)

Here $\mathcal {L}_\mathrm{{CE}}$ represents the cross-entropy loss, $\lambda $ denotes the weight of triplet ratio loss and is empirically set to 1.

4 Experiment

We evaluate the effectiveness of the proposed triplet ratio loss on four popular ReID benchmarks, i.e., Market-1501 [45], DukeMTMC-reID [47], CUHK03 [16], and MSMT17 [35]. We follow the official evaluation protocols for these datasets and report the Rank-1 accuracy and mean Average Precision (mAP).
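For readers unfamiliar with the two metrics, the following plain-Python sketch shows how Rank-1 accuracy and mAP can be computed from a query-to-gallery distance matrix. It is a simplified illustration under stated assumptions, not the official protocol of any of the four datasets: in particular, it omits the camera-based filtering of gallery images that ReID evaluation protocols typically apply, and the function name and toy data are hypothetical.

```python
def rank1_and_map(dist, query_ids, gallery_ids):
    """Return (Rank-1 accuracy, mAP) for a distance matrix dist[q][g].

    Simplified sketch: every gallery image is kept for every query,
    i.e. no same-camera filtering is applied.
    """
    rank1_hits, ap_sum, num_valid = 0, 0.0, 0
    for q, row in enumerate(dist):
        # Gallery indices sorted from most to least similar.
        order = sorted(range(len(row)), key=lambda g: row[g])
        matches = [gallery_ids[g] == query_ids[q] for g in order]
        if not any(matches):
            continue  # queries without a true match are skipped
        num_valid += 1
        rank1_hits += int(matches[0])
        # Average precision: mean of precision values at each true match.
        hits, precision_sum = 0, 0.0
        for rank, is_match in enumerate(matches, start=1):
            if is_match:
                hits += 1
                precision_sum += hits / rank
        ap_sum += precision_sum / hits
    if num_valid == 0:
        raise ValueError("no query has a matching gallery image")
    return rank1_hits / num_valid, ap_sum / num_valid


# Toy example with two queries and four gallery images (made-up values).
query_ids = [1, 2]
gallery_ids = [1, 2, 1, 3]
dist = [
    [0.2, 0.5, 0.9, 0.7],  # query of identity 1
    [0.3, 0.6, 0.8, 0.9],  # query of identity 2
]
rank1, mean_ap = rank1_and_map(dist, query_ids, gallery_ids)
print(f"Rank-1 = {rank1:.3f}, mAP = {mean_ap:.3f}")
# Rank-1 = 0.500, mAP = 0.625
```

The first query retrieves its true matches at ranks 1 and 4 (AP 0.75), the second at rank 2 (AP 0.5), which gives the values printed above.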

Market-1501 contains 12,936 images of 751 identities in the training dataset and 23,100 images of 750 identities in the test dataset. DukeMTMC-ReID consists of 16,522 training images of 702 identities and 19,889 testing images of 702 identities. CUHK03 includes 14,097 pedestrian images of 1,467 identities. The new training/testing protocol detailed in [49] is adopted. MSMT17 is divided into a training set containing 32,621 images of 1,041 identities, and a testing set comprising 93,820 images of 3,060 identities.

4.1 Implementation Details

Experiments are conducted using the PyTorch framework. During the training stage, both offline and online strategies are adopted for data augmentation [16]. Offline translation is applied and each training set is enlarged by a factor of 5. Besides, horizontal flipping and random erasing [50] with a ratio of 0.5 are utilized. All images are resized to $384 \times 128$.

ResNet50 [9] trained with only the cross-entropy loss is used as the baseline, and the backbone is pre-trained on ImageNet. In order to sample triplets for the triplet ratio loss, we set P to 6 and A to 8 to construct a batch (whose size is therefore 48). The value of $\beta $ is set to 0.4 for CUHK03 and 0.7 for the other three datasets. The standard stochastic gradient descent (SGD) optimizer with a weight decay of $5 \times 10^{-4}$ and a momentum of 0.9 is utilized for model optimization. All the models are trained in an end-to-end fashion for 70 epochs. The learning rate is initially set to 0.01 and is then multiplied by 0.1 every 20 epochs.
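The batch size and learning-rate schedule stated above can be written out explicitly. The sketch below is plain Python with a hypothetical helper `learning_rate`; it assumes zero-based epoch indexing and that the decay is applied at epochs 20, 40 and 60, which the text does not state precisely.

```python
P, A = 6, 8            # batch construction symbols used in the text
batch_size = P * A     # 48 images per batch

NUM_EPOCHS = 70


def learning_rate(epoch, base_lr=0.01, gamma=0.1, step_size=20):
    """Step decay: multiply the rate by gamma every step_size epochs.

    Assumes zero-based epochs; the exact decay epochs are an assumption.
    """
    return base_lr * gamma ** (epoch // step_size)


schedule = [learning_rate(epoch) for epoch in range(NUM_EPOCHS)]
for epoch in (0, 19, 20, 40, 60, 69):
    print(f"epoch {epoch:2d}: lr = {schedule[epoch]:.0e}")
```

In PyTorch, a step-decay rule of this form is commonly expressed with `torch.optim.lr_scheduler.StepLR(optimizer, step_size=20, gamma=0.1)`; whether the authors used that scheduler is not stated.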

4.2 Impact of the Hyper-Parameter $\beta $

In this experiment, we evaluate the performance of the triplet ratio loss with different values of $\beta $ on Market-1501 and CUHK03-Label. The other experimental settings are kept fixed to allow a clean comparison.

From the experimental results illustrated in Fig. 4, we can make the following two observations. First, the performance of the triplet ratio loss initially improves as the value of $\beta $ increases; this is because a small value of $\beta $ imposes too strict a constraint on the intra-identity distances. Second, the performance of the triplet ratio loss drops when the value of $\beta $ increases further. This is because a large value of $\beta $ imposes loose constraints on the intra-identity distances and therefore harms the intra-identity compactness.

4.3 Triplet Ratio Loss vs. Triplet Loss

We show the superiority of the proposed triplet ratio loss over triplet loss by comparing them from both quantitative and qualitative perspectives.

We compare the performance of the triplet ratio loss with that of the triplet loss under different values of $\alpha $ (the margin of the triplet loss) in Fig. 5. The triplet ratio loss consistently outperforms the triplet loss in both Rank-1 accuracy and mAP. For example, the triplet ratio loss beats the best performance of the triplet loss by 0.9% in Rank-1 accuracy and 1.6% in mAP on Market-1501. Besides, the experimental results listed in Table 1 show that the triplet ratio loss brings consistent performance improvements over the baseline: in particular, the Rank-1 accuracy is improved by 11.0% and the mAP by 13.8% on MSMT17. These experimental results verify the effectiveness of the proposed triplet ratio loss.

Table 1. Evaluation of triplet ratio loss on ResNet50. R-1 in Table 1 and Table 2 denotes the Rank-1 accuracy.

This conclusion is further supported by visualizing the features optimized by the triplet ratio loss and the triplet loss in Fig. 6, as well as the curves of the cross-entropy loss when adopting each of the two losses in Fig. 7. From Fig. 6 and Fig. 7, we can make the following observations. First, the features optimized with the triplet ratio loss are more compact than those learned using the triplet loss; this indicates that the triplet ratio loss is more effective in addressing the intra-pair variations. Second, the cross-entropy loss converges faster with the triplet ratio loss than with the triplet loss. This is because the triplet ratio loss improves the optimization procedure by providing more reasonable gradients. The above analyses support the superiority of the triplet ratio loss.

4.4 Comparison with State-of-the-Art

We compare the proposed triplet ratio loss with state-of-the-art methods on Market-1501 [45], DukeMTMC-ReID [47], CUHK03 [16], and MSMT17 [35]. For fair comparison, we divide existing approaches into two categories, i.e., holistic feature-based (HF) methods and part feature-based (PF) methods.

After examining the results tabulated in Table 2, we can make the following observations. First, compared with the PF-based methods, equipping ResNet-50 with the proposed triplet ratio loss achieves comparable performance, even though PF-based methods extract fine-grained part-level representations. Second, equipping ResNet-50 with the proposed triplet ratio loss also achieves comparable performance when compared with the state-of-the-art HF-based methods. For example, the Rank-1 accuracy of our method is the same as that of 3DSL [2], a recent method that requires additional 3D clues, while the mAP of our method is only 0.6% lower than that of 3DSL on Market-1501. Third, compared with works that explore loss functions for ReID, the proposed method achieves the best performance when using the same backbone for feature extraction. Specifically, the proposed triplet ratio loss outperforms Circle loss [26] by 3.0% and 5.0% in terms of the Rank-1 accuracy and mAP, respectively, on the MSMT17 benchmark. Finally, re-ranking [49] further improves the performance of the triplet ratio loss, which achieves 95.8% and 93.6%, 91.6% and 88.7%, 83.7% and 83.9%, 85.8% and 85.6%, 83.5% and 71.3% in terms of the Rank-1 accuracy and mAP, respectively, on each dataset. The above comparisons justify the effectiveness of the triplet ratio loss.

Table 2. Performance comparisons on four popular benchmarks. “-” indicates that the result is not available.

5 Conclusion

In this paper, we propose a novel triplet ratio loss to address two inherent problems associated with the triplet loss: 1) the heavy influence of intra-pair variations and 2) unreasonable gradients. More specifically, first, the triplet ratio loss directly optimizes the ratio of the intra-identity distance to the inter-identity distance; the margin between the intra-identity distance and the inter-identity distance can therefore be adaptively adjusted according to the respective triplet. Second, the triplet ratio loss adjusts the optimization gradients for the embeddings by considering both the inter-identity distances and the intra-identity distances. The experimental results on four widely used ReID benchmarks demonstrate the effectiveness and superiority of the proposed triplet ratio loss.

Notes

1. Repelling gradient denotes the gradient that pushes the features away from each other, while attracting gradient indicates the gradient that pulls the features closer.
2. 0.4 is an empirical value for both $\alpha $ [31] and $\beta $. Please refer to Sect. 4.2 for extensive evaluation on the value of $\beta $.

References

1. Chen, B., Deng, W., Hu, J.: Mixed high-order attention network for person re-identification. In: ICCV, pp. 371–381 (2019)
2. Chen, J., et al.: Learning 3D shape feature for texture-insensitive person re-identification. In: CVPR, pp. 8146–8155 (2021)
3. Cheng, D., Gong, Y., Zhou, S., Wang, J., Zheng, N.: Person re-identification by multi-channel parts-based CNN with improved triplet loss function. In: CVPR, pp. 1335–1344 (2016)
4. Chopra, S., Hadsell, R., LeCun, Y.: Learning a similarity metric discriminatively, with application to face verification. In: CVPR, pp. 539–546 (2005)
5. Dai, Z., Chen, M., Gu, X., Zhu, S., Tan, P.: Batch dropblock network for person re-identification and beyond. In: ICCV, pp. 3690–3700 (2019)
6. Ding, C., Wang, K., Wang, P., Tao, D.: Multi-task learning with coarse priors for robust part-aware person re-identification. TPAMI 44(3), 1474–1488 (2022)
7. Fang, P., Zhou, J., Roy, S.K., Petersson, L., Harandi, M.: Bilinear attention networks for person retrieval. In: ICCV, pp. 8029–8038 (2019)
8. Ha, M.L., Blanz, V.: Deep ranking with adaptive margin triplet loss. arXiv preprint arXiv:2107.06187 (2021)
9. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
10. Hermans, A., Beyer, L., Leibe, B.: In defense of the triplet loss for person re-identification. arXiv preprint arXiv:1703.07737 (2017)
11. Ho, K., Keuper, J., Pfreundt, F.J., Keuper, M.: Learning embeddings for image clustering: an empirical study of triplet loss approaches. In: ICPR, pp. 87–94 (2021)
12. Hou, R., Chang, H., Ma, B., Huang, R., Shan, S.: BiCnet-TKS: learning efficient spatial-temporal representation for video person re-identification. In: CVPR, pp. 2014–2023 (2021)
13. Hou, R., Ma, B., Chang, H., Gu, X., Shan, S., Chen, X.: Interaction-and-aggregation network for person re-identification. In: CVPR, pp. 9317–9326 (2019)
14. Huang, T., Qu, W., Zhang, J.: Continual representation learning via auto-weighted latent embeddings on person ReID. In: PRCV, pp. 593–605 (2021)
15. Li, J., Zhang, S., Tian, Q., Wang, M., Gao, W.: Pose-guided representation learning for person re-identification. TPAMI 44(2), 622–635 (2022)
16. Li, W., Zhao, R., Xiao, T., Wang, X.: Deepreid: deep filter pairing neural network for person re-identification. In: CVPR, pp. 152–159 (2014)
17. Liu, W., Wen, Y., Yu, Z., Yang, M.: Large-margin softmax loss for convolutional neural networks. In: ICML, vol. 2, p. 7 (2016)
18. Liu, X., Yu, L., Lai, J.: Group re-identification based on single feature attention learning network (SFALN). In: PRCV, pp. 554–563 (2021)
19. Luo, C., Chen, Y., Wang, N., Zhang, Z.: Spectral feature transformation for person re-identification. In: ICCV, pp. 4975–4984 (2019)
20. Luo, H., Gu, Y., Liao, X., Lai, S., Jiang, W.: Bag of tricks and a strong baseline for deep person re-identification. In: CVPR W (2019)
21. Qian, X., Fu, Y., Xiang, T., Jiang, Y.G., Xue, X.: Leader-based multi-scale attention deep architecture for person re-identification. TPAMI 42(2), 371–385 (2020)
22. Quan, R., Dong, X., Wu, Y., Zhu, L., Yang, Y.: Auto-ReID: searching for a part-aware convnet for person re-identification. In: ICCV, pp. 3749–3758 (2019)
23. Rao, Y., Chen, G., Lu, J., Zhou, J.: Counterfactual attention learning for fine-grained visual categorization and re-identification. In: ICCV, pp. 1025–1034 (2021)
24. Schroff, F., Kalenichenko, D., Philbin, J.: Facenet: a unified embedding for face recognition and clustering. In: CVPR, pp. 815–823 (2015)
25. Shu, X., Yuan, D., Liu, Q., Liu, J.: Adaptive weight part-based convolutional network for person re-identification. Multimedia Tools Appl. 79, 23617–23632 (2020). https://doi.org/10.1007/s11042-020-09018-x
26. Sun, Y., et al.: Circle loss: a unified perspective of pair similarity optimization. In: CVPR, pp. 6398–6407 (2020)
27. Sun, Y., Zheng, L., Yang, Y., Tian, Q., Wang, S.: Beyond part models: person retrieval with refined part pooling (and a strong convolutional baseline). In: ECCV, pp. 480–496 (2018)
28. Tao, D., Guo, Y., Yu, B., Pang, J., Yu, Z.: Deep multi-view feature learning for person re-identification. TCSVT 28(10), 2657–2666 (2017)
29. Wang, F., Xiang, X., Cheng, J., Yuille, A.L.: Normface: L2 hypersphere embedding for face verification. In: ACM MM, pp. 1041–1049 (2017)
30. Wang, G., Yuan, Y., Chen, X., Li, J., Zhou, X.: Learning discriminative features with multiple granularities for person re-identification. In: ACM MM, pp. 274–282 (2018)
31. Wang, K., Ding, C., Maybank, S.J., Tao, D.: CDPM: convolutional deformable part models for semantically aligned person re-identification. TIP 29, 3416–3428 (2019)
32. Wang, K., Wang, P., Ding, C., Tao, D.: Batch coherence-driven network for part-aware person re-identification. TIP 30, 3405–3418 (2021)
33. Wang, P., Ding, C., Shao, Z., Hong, Z., Zhang, S., Tao, D.: Quality-aware part models for occluded person re-identification. arXiv preprint arXiv:2201.00107 (2022)
34. Wang, W., Pei, W., Cao, Q., Liu, S., Lu, G., Tai, Y.W.: Push for center learning via orthogonalization and subspace masking for person re-identification. TIP 30, 907–920 (2020)
35. Wei, L., Zhang, S., Gao, W., Tian, Q.: Person transfer GAN to bridge domain gap for person re-identification. In: CVPR, pp. 79–88 (2018)
36. Wu, C.Y., Manmatha, R., Smola, A.J., Krahenbuhl, P.: Sampling matters in deep embedding learning. In: ICCV, pp. 2840–2848 (2017)
37. Wu, Y., Lin, Y., Dong, X., Yan, Y., Ouyang, W., Yang, Y.: Exploit the unknown gradually: one-shot video-based person re-identification by stepwise learning. In: CVPR, pp. 5177–5186 (2018)
38. Ye, M., Lan, X., Leng, Q., Shen, J.: Cross-modality person re-identification via modality-aware collaborative ensemble learning. TIP 29, 9387–9399 (2020)
39. Ye, M., Li, J., Ma, A.J., Zheng, L., Yuen, P.C.: Dynamic graph co-matching for unsupervised video-based person re-identification. TIP 28(6), 2976–2990 (2019)
40. Yi, D., Lei, Z., Liao, S., Li, S.Z.: Learning face representation from scratch. arXiv preprint arXiv:1411.7923 (2014)
41. Yu, B., Liu, T., Gong, M., Ding, C., Tao, D.: Correcting the triplet selection bias for triplet loss. In: ECCV, pp. 71–87 (2018)
42. Yu, S., et al.: Multiple domain experts collaborative learning: multi-source domain generalization for person re-identification. arXiv preprint arXiv:2105.12355 (2021)
43. Zhang, Z., Lan, C., Zeng, W., Chen, Z.: Densely semantically aligned person re-identification. In: CVPR, pp. 667–676 (2019)
44. Zhao, X., Qi, H., Luo, R., Davis, L.: A weakly supervised adaptive triplet loss for deep metric learning. In: ICCV W (2019)
45. Zheng, L., Shen, L., Tian, L., Wang, S., Wang, J., Tian, Q.: Scalable person re-identification: a benchmark. In: ICCV, pp. 1116–1124 (2015)
46. Zheng, M., Karanam, S., Wu, Z., Radke, R.J.: Re-identification with consistent attentive siamese networks. In: CVPR, pp. 5735–5744 (2019)
47. Zheng, Z., Zheng, L., Yang, Y.: Unlabeled samples generated by GAN improve the person re-identification baseline in vitro. In: ICCV, pp. 3754–3762 (2017)
48. Zhong, Y., Wang, X., Zhang, S.: Robust partial matching for person search in the wild. In: CVPR, pp. 6827–6835 (2020)
49. Zhong, Z., Zheng, L., Cao, D., Li, S.: Re-ranking person re-identification with k-reciprocal encoding. In: CVPR, pp. 1318–1327 (2017)
50. Zhong, Z., Zheng, L., Kang, G., Li, S., Yang, Y.: Random erasing data augmentation. In: AAAI, pp. 13001–13008 (2020)
51. Zhou, K., Yang, Y., Cavallaro, A., Xiang, T.: Omni-scale feature learning for person re-identification. In: ICCV, pp. 3701–3711 (2019)
52. Zhou, Z., Li, Y., Gao, J., Xing, J., Li, L., Hu, W.: Anchor-free one-stage online multi-object tracking. In: PRCV, pp. 55–68 (2020)

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grant U2013601, and the Program of Guangdong Provincial Key Laboratory of Robot Localization and Navigation Technology, under Grant 2020B121202011 and Key-Area Research and Development Program of Guangdong Province, China, under Grant 2019B010154003.

Author information

Authors and Affiliations

UBTech Robotics Corp Ltd, Shenzhen, China
Shuping Hu, Kan Wang, Huan Tan & Jianxin Pang
SIAT, Chinese Academy of Sciences, Beijing, China
Kan Wang
Institute for Infocomm Research, A*STAR, Singapore, Singapore
Jun Cheng

Corresponding author

Correspondence to Jianxin Pang.

Editor information

Editors and Affiliations

Southern University of Science and Technology, Shenzhen, China
Shiqi Yu
Institute of Automation, Chinese Academy of Sciences, Beijing, China
Zhaoxiang Zhang
Hong Kong Baptist University, Hong Kong, China
Pong C. Yuen
Northwestern Polytechnical University, Xi’an, China
Junwei Han
Institute of Automation, Chinese Academy of Sciences, Beijing, China
Tieniu Tan
Hong Kong Baptist University, Hong Kong, China
Yike Guo
Sun Yat-sen University, Guangzhou, China
Jianhuang Lai
Southern University of Science and Technology, Shenzhen, China
Jianguo Zhang

Cite this paper

Hu, S., Wang, K., Cheng, J., Tan, H., Pang, J. (2022). Triplet Ratio Loss for Robust Person Re-identification. In: Yu, S., et al. Pattern Recognition and Computer Vision. PRCV 2022. Lecture Notes in Computer Science, vol 13534. Springer, Cham. https://doi.org/10.1007/978-3-031-18907-4_4

DOI: https://doi.org/10.1007/978-3-031-18907-4_4
Published: 27 October 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-18906-7
Online ISBN: 978-3-031-18907-4
eBook Packages: Computer Science, Computer Science (R0)
